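Idea: an algorithm that gradually splits a blank document into blocks that together make up a valid layout, then train a model to predict reading order on the generated layouts. The reading order is actually related to the splitting, so the sequence of splits could itself supply the order labels.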
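A minimal sketch of how this might look, under assumptions the note does not state: cuts are axis-aligned, the page is the unit square with y growing downward, and the reading order is the depth-first order of the split tree (top before bottom, left before right). The names `Block`, `split_page`, `make_example` and the parameters `max_depth`, `min_size`, `stop_prob` are all hypothetical, picked for illustration only.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    x0: float  # left
    y0: float  # top (y grows downward, an assumption of this sketch)
    x1: float  # right
    y1: float  # bottom


def split_page(block: Block, rng: random.Random, depth: int = 0,
               max_depth: int = 3, min_size: float = 0.15,
               stop_prob: float = 0.2) -> List[Block]:
    """Recursively cut `block`; return the leaf blocks in reading order."""
    can_cut_h = (block.y1 - block.y0) >= 2 * min_size  # top / bottom
    can_cut_v = (block.x1 - block.x0) >= 2 * min_size  # left / right
    stop = (
        depth >= max_depth
        or not (can_cut_h or can_cut_v)
        or (depth > 0 and rng.random() < stop_prob)  # always cut the page once
    )
    if stop:
        return [block]

    axis = rng.choice([a for a, ok in (("h", can_cut_h), ("v", can_cut_v)) if ok])
    if axis == "h":
        cut = rng.uniform(block.y0 + min_size, block.y1 - min_size)
        first = Block(block.x0, block.y0, block.x1, cut)   # top part
        second = Block(block.x0, cut, block.x1, block.y1)  # bottom part
    else:
        cut = rng.uniform(block.x0 + min_size, block.x1 - min_size)
        first = Block(block.x0, block.y0, cut, block.y1)   # left part
        second = Block(cut, block.y0, block.x1, block.y1)  # right part

    # Assumed convention: everything in `first` is read before `second`,
    # so concatenating the recursive results yields the reading order.
    kwargs = dict(max_depth=max_depth, min_size=min_size, stop_prob=stop_prob)
    return (split_page(first, rng, depth + 1, **kwargs)
            + split_page(second, rng, depth + 1, **kwargs))


def make_example(seed: int) -> Tuple[List[Block], List[int]]:
    """Build one training pair: shuffled blocks plus each block's reading rank."""
    rng = random.Random(seed)
    ordered = split_page(Block(0.0, 0.0, 1.0, 1.0), rng)
    ranks = list(range(len(ordered)))
    rng.shuffle(ranks)
    # shuffled[k] is the block whose reading-order rank is ranks[k]
    shuffled = [ordered[r] for r in ranks]
    return shuffled, ranks


if __name__ == "__main__":
    blocks, ranks = make_example(seed=0)
    for block, rank in zip(blocks, ranks):
        print(rank, block)
```

Shuffling the blocks before handing them to the model keeps the input order from leaking the answer; the model would then have to recover `ranks` from geometry alone. A real generator would probably also need gutters, margins, and non-rectangular cases, which this sketch leaves out.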