This repository will host raw and polished data sets about mortgages on commercially operated buildings in New York City, excluding Staten Island. Here is an excerpt of the text data.
Roadmap:
- Commercial real estate with retail space: We have downloaded, processed, and OCRed more than 7 million pages (more than 550 GB) of image documents on the mortgages of more than 30,000 properties.
- Entity extraction: We trained a machine-learning model to read the "Mortgage Schedule" pages.
- Convert the list of extracted entities to episodic data.
- Extend to all other commercial real estate: more than 60,000 properties.
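
The "entities to episodic data" step is not specified further in this README. As a rough illustration only, the sketch below assumes that each extracted entity carries a property identifier, a lender, an amount, and a recording date, and that an "episode" is the interval from one mortgage's recording until the next mortgage recorded on the same property. The field names (`property_id`, `lender`, `amount`, `recorded`) and the rule that a new mortgage closes the previous one are hypothetical simplifications, not the repository's actual schema or method.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """One extracted mortgage record (hypothetical fields)."""
    property_id: str
    lender: str
    amount: float
    recorded: date


@dataclass(frozen=True)
class Episode:
    """A mortgage spell; `end` is None while no later record exists."""
    property_id: str
    lender: str
    amount: float
    start: date
    end: Optional[date]


def to_episodes(entities: List[Entity]) -> List[Episode]:
    """Turn a flat list of entities into per-property episodes.

    Assumption: a later mortgage on the same property ends the earlier one.
    """
    # groupby needs its input sorted by the grouping key.
    ordered = sorted(entities, key=lambda e: (e.property_id, e.recorded))
    episodes: List[Episode] = []
    for _, group in groupby(ordered, key=lambda e: e.property_id):
        rows = list(group)
        # Pair each record with its successor; the last one has none.
        for current, following in zip(rows, rows[1:] + [None]):
            episodes.append(
                Episode(
                    property_id=current.property_id,
                    lender=current.lender,
                    amount=current.amount,
                    start=current.recorded,
                    end=following.recorded if following else None,
                )
            )
    return episodes
```

Real mortgage schedules can list several concurrent loans on one property, so an actual conversion would likely need explicit satisfaction or assignment records rather than this successor rule.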