This doesn't really answer your question, but hopefully it gives some insight into our process.
The main bottlenecks were breaking the fisheye-style panoramas into separate perspective views (so the text was more readable), passing those views to OCR, and acquiring the panoramas in the first place, since there isn't an official API.
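For anyone curious, the reprojection step looks roughly like the sketch below: for every pixel of a flat output view, trace a ray, rotate it by the desired heading, and sample the equirectangular panorama at the matching longitude/latitude. This is a generic equirectangular-to-perspective transform, not our production code, and the FOV/yaw/pitch parameters are illustrative:

```python
# Sketch: reproject an equirectangular panorama into a flat perspective
# view so OCR sees mostly undistorted text. numpy + OpenCV.
import numpy as np
import cv2

def perspective_view(pano, fov_deg=90, yaw_deg=0, pitch_deg=0, out_size=1024):
    h, w = pano.shape[:2]
    f = (out_size / 2) / np.tan(np.radians(fov_deg) / 2)  # focal length, px

    # Ray direction for every output pixel (camera looks down +z).
    xs, ys = np.meshgrid(np.arange(out_size) - out_size / 2,
                         np.arange(out_size) - out_size / 2)
    dirs = np.stack([xs, ys, np.full_like(xs, f, dtype=np.float64)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate rays by the requested yaw (around y) and pitch (around x).
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    dirs = dirs @ (Ry @ Rx).T

    # Map ray directions to equirectangular pixel coordinates and sample.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])    # -pi .. pi
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))   # -pi/2 .. pi/2
    map_x = ((lon / np.pi + 1) / 2 * w).astype(np.float32)
    map_y = ((lat / (np.pi / 2) + 1) / 2 * h).astype(np.float32)
    return cv2.remap(pano, map_x, map_y, cv2.INTER_LINEAR)

# e.g. four 90-degree views around the horizon:
# views = [perspective_view(pano, yaw_deg=a) for a in (0, 90, 180, 270)]
```

Each flat view then goes to OCR, where signage text is far less warped than in the raw panorama.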
Because of the above, we constrained ourselves from the outset. For example, the spacing between panoramas was 50 m, we didn't traverse residential roads (which were less likely to have signage), we only used the most recent panorama for a location, etc.
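To illustrate the spacing constraint, sampling a point every 50 m along non-residential road geometry might look like the shapely-based sketch below. The OSM-style tag filter and the data loading are assumptions for the example, not our actual pipeline:

```python
# Sketch: build a 50 m sampling grid over non-residential roads.
# Assumes road geometry as shapely LineStrings in a metric CRS,
# each paired with an OSM-style "highway" tag.
from shapely.geometry import LineString

SKIP_TAGS = {"residential", "living_street", "service"}  # assumed filter
SPACING_M = 50.0

def sample_points(roads):
    """roads: iterable of (LineString, highway_tag) pairs."""
    for geom, tag in roads:
        if tag in SKIP_TAGS:
            continue  # residential roads rarely carry signage
        d = 0.0
        while d <= geom.length:
            yield geom.interpolate(d)  # one Point every 50 m along the road
            d += SPACING_M
```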
If I interpret "global" as running without those constraints (5 m spacing, every road, all historic panoramas), then I think the first problem you'll run into is being rate limited by Google. Compute may be able to solve the other problems, but it would be very expensive.
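At that scale, any crawler would at minimum need client-side throttling and backoff along the lines of this generic sketch (the URL and rates are placeholders, not a documented Google endpoint):

```python
# Sketch: fetch with client-side throttling and exponential backoff
# on HTTP 429, which any large crawl would roughly need.
import time
import requests

def fetch(url, min_interval=1.0, max_retries=5):
    for attempt in range(max_retries):
        resp = requests.get(url, timeout=30)
        if resp.status_code == 429:            # rate limited: back off
            time.sleep(min_interval * 2 ** attempt)
            continue
        resp.raise_for_status()
        time.sleep(min_interval)               # stay under the limit
        return resp.content
    raise RuntimeError(f"gave up on {url} after {max_retries} attempts")
```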
Search all text in New York City - https://news.ycombinator.com/item?id=44883304 - Aug 2025 (116 comments)
All text in Brooklyn - https://news.ycombinator.com/item?id=41344245 - Aug 2024 (50 comments)