Scaledra is a navigation proxy: it returns the code of a website's pages after the layout engine has rendered them and their JavaScript has executed. Many sites are dynamic, so simply downloading the page source does not yield the complete page that a standard browser would produce when opening it.
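To illustrate the gap between downloaded source and a rendered page, here is a minimal sketch using only the Python standard library. The page source, the `VisibleText` class and the `visible_text` function are hypothetical names made up for this example; they are not part of Scaledra.

```python
from html.parser import HTMLParser

# Hypothetical page source, as a plain HTTP download would return it.
# The visible text is only created when the script runs in a browser.
RAW_SOURCE = """<html><body>
<div id="content"></div>
<script>
document.getElementById("content").textContent = "Price: 42 EUR";
</script>
</body></html>"""


class VisibleText(HTMLParser):
    """Collects text found outside <script> tags. It never runs JavaScript."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Keep only non-empty text that is not script code.
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())


def visible_text(source):
    """Return the text a reader would see if no script were executed."""
    parser = VisibleText()
    parser.feed(source)
    parser.close()
    return " ".join(parser.chunks)
```

In this sketch `visible_text(RAW_SOURCE)` is empty, because the price only exists inside the script that a plain download never runs. A rendering proxy would return the page after that script has filled in the `content` element.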
Which engine is behind Scaledra?
It is based on a headless version of WebKit.
Scaledra has been developed with the following goals in mind:
Scalability: The service must be able to scale out according to the workload in order to guarantee the desired service level
Compatibility: It uses WebKit and its JavaScript engine in order to provide the highest level of web compatibility available on the market today
Portability: The service has been developed to run on both Microsoft Windows and GNU/Linux platforms
Distributability: Thanks to its modularity, it can be deployed and orchestrated in geographically distributed environments
The project's goal is a geographically distributed proxy infrastructure that retrieval platforms can use to navigate websites. Through the Scaledra Active Proxy infrastructure, a retrieval platform is able to navigate any website around the world while avoiding the following problems:
Routing problems: In some cases websites are not reachable from a specific geographic location
Bans: When requests are frequent, Scaledra is able to navigate the same site from a pool of different IP addresses, making it more difficult to detect a navigation pattern
Geographic filters: Some websites are configured to be inaccessible from specific countries
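The idea of navigating one site from a pool of addresses in different locations can be sketched as follows. This is only an illustration under stated assumptions: `ProxyNode`, `usable_nodes`, `rotation` and the addresses in `POOL` (taken from the documentation-only IP ranges) are hypothetical and do not describe how Scaledra actually selects its nodes.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class ProxyNode:
    address: str  # hypothetical exit IP address of the node
    country: str  # country code of the node's location


def usable_nodes(nodes, blocked_countries):
    """Drop nodes in countries the target site filters or cannot be reached from."""
    return [n for n in nodes if n.country not in blocked_countries]


def rotation(nodes, blocked_countries=frozenset()):
    """Return an endless round-robin iterator over the usable nodes."""
    candidates = usable_nodes(nodes, blocked_countries)
    if not candidates:
        raise ValueError("no proxy node can reach this site")
    return cycle(candidates)


# Hypothetical pool; addresses come from ranges reserved for documentation.
POOL = [
    ProxyNode("192.0.2.10", "IT"),
    ProxyNode("198.51.100.7", "US"),
    ProxyNode("203.0.113.5", "DE"),
]

# Assume the target site is not accessible from the US.
picker = rotation(POOL, blocked_countries={"US"})
first_four = [next(picker).address for _ in range(4)]
```

Here consecutive requests alternate between the Italian and German nodes, and the US node is never used. Round-robin is chosen only for simplicity; since a fixed order is itself a pattern, a random choice among the usable nodes may be preferable in practice.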