
They really don't want people scraping their data with extensions. The LI API response is the worst tangled mess I've ever seen... It's so bad, I have to assume it's intentional. It took me 3 days to parse their responses. I had to build a special rules-based scraping engine that lets me filter and map items layer by layer, based on the relative positions of those items, with flexible rules. A bit like CSS selectors but more complicated.

The hard part is that some APIs return items in a different order or at a different nesting depth, so my engine normalizes all the variants into consistent objects.
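
To give a rough idea of what "layer by layer" matching plus normalization can look like, here is a minimal Python sketch. It is not the actual engine: the response shapes, the key names (`type`, `entity`, `name`, `fullName`) and the helpers `walk`, `select`, `normalize` and `extract` are all made up for illustration.

```python
from typing import Any, Callable, Dict, Iterator, List

Predicate = Callable[[Dict[str, Any]], bool]


def walk(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every dict in a nested JSON-like structure, at any depth."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def select(root: Any, layers: List[Predicate]) -> List[Dict[str, Any]]:
    """Apply one rule per layer; each layer searches the descendants
    of the previous layer's matches, ignoring order and depth."""
    current = [root]
    for predicate in layers:
        matched = []
        for node in current:
            for candidate in walk(node):
                if candidate is not node and predicate(candidate):
                    matched.append(candidate)
        current = matched
    return current


def normalize(item: Dict[str, Any], aliases: Dict[str, List[str]]) -> Dict[str, Any]:
    """Map variant key names onto one consistent object shape."""
    return {
        field: next((item[key] for key in keys if key in item), None)
        for field, keys in aliases.items()
    }


# Hypothetical rules: find a "card", then the entity nested somewhere inside it.
LAYERS: List[Predicate] = [
    lambda d: d.get("type") == "card",
    lambda d: "name" in d or "fullName" in d,
]
ALIASES = {"name": ["name", "fullName"], "headline": ["headline", "title"]}


def extract(response: Any) -> List[Dict[str, Any]]:
    return [normalize(item, ALIASES) for item in select(response, LAYERS)]


# Two invented variants of the "same" data: different order, depth and key names.
variant_a = {"data": {"elements": [
    {"type": "card", "entity": {"name": "Ada", "headline": "Engineer"}},
]}}
variant_b = {"included": [{"wrapper": {
    "type": "card",
    "payload": {"entity": {"headline": "Engineer", "fullName": "Ada"}},
}}]}

print(extract(variant_a))
print(extract(variant_b))
```

Both variants come out as the same object because the rules only care about where an item sits relative to its matched ancestor, not about the absolute path. A real engine would need more than this (sibling-relative rules, handling of nested matches, and so on).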

It's quite impressive that LI works at all given the complexity.


