Wednesday, September 18, 2019

Launch HN: Dashblock (YC S19) – Turn Any Website into an API https://ift.tt/30oARSZ

Hey HN,

We're Hugues and Max, co-founders of Dashblock ( https://dashblock.com ). Dashblock turns any website into an API. People use us to access product information, news content, sales-related data or real-estate offers, for instance.

As a data scientist, Hugues realised how complicated it is to access web data programmatically when a website doesn't provide an API. You have to build a script to pull the HTML, render the page in some cases, find selectors for the information you are interested in, and distribute your tasks to scale. If the structure of the page changes, you have to update your selectors to locate the information again.

We decided to build Dashblock to make it really simple to access web data through an API. Our software is basically a browser that lets you open a website, right-click on the information you want to extract, and preview your API on other pages.

In order to create long-lasting APIs, we developed a machine learning model that is resilient to website updates. For now, we mainly handle changes at the level of the HTML structure, but with enough training data we will also be resilient to UI updates. In addition, our model detects similar content on the page to make the selection process easier.

When you call your API, we launch a headless browser, render the page, classify the content of the page using structural, visual and semantic features, and structure it by minimizing the entropy to give you a list when needed.

Our pricing is based on the number of API calls our users make per month. If you want to give it a try, we currently offer 10k API calls when you sign up! You can download our software here: dashblock.com.

If you have any questions, we would be happy to answer them, and if you have any related ideas, feedback or experiences, feel free to share them :)

Thank you!

September 18, 2019 at 10:36AM
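To illustrate the manual approach the post describes (find a selector, then fix it when the page changes), here is a minimal sketch using only the Python standard library. It is not Dashblock's code or API. The sample pages, the class names `price` and `product-price`, and the helper `extract` are all hypothetical, chosen only to show how a hard-coded selector silently stops matching after a markup change.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given CSS class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1             # nested tag inside a match
        elif self.target_class in classes:
            self.depth = 1              # entering a new match
            self.results.append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.results[-1] += data


def extract(html, target_class):
    """Hypothetical helper: return the stripped text of matching elements."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


# Hypothetical page before and after a site redesign renames the class.
OLD_PAGE = '<ul><li class="price">$10</li><li class="price">$12</li></ul>'
NEW_PAGE = ('<ul><li class="product-price">$10</li>'
            '<li class="product-price">$12</li></ul>')

if __name__ == "__main__":
    print(extract(OLD_PAGE, "price"))   # selector matches the old markup
    print(extract(NEW_PAGE, "price"))   # same selector finds nothing now
```

The second call returns an empty list rather than raising an error, which is why this kind of breakage tends to go unnoticed until the data is missing downstream.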
