Used cars classifieds scraping

If you want a complete and accurate database of cars sold in your country, used car classifieds websites are NOT a good place to source data from, because their data changes constantly: every day new cars are posted for sale while others become inactive, many car models are duplicated, and some car models are missing because none is on sale today.

Quality databases with car specs can be made from websites that collect data from manufacturers, for example edmunds.com, parkers.co.uk, carwale.com, etc.

In 2016 someone asked me via this comment to scrape data from a used cars website, and I initially said that it was stupid to do this… I am sorry for this. The next year I noticed that there are several cases in which you specifically need to extract data from used car listings, for example:

  • You are doing an analysis of the most popular car models, their average price, age, mileage, etc.
  • You are training image recognition software and thus need thousands of amateur images labelled with car make and model. Used car classifieds websites are the best option to get such images.
  • You want to start your own used cars website and can do this by copying listings from an existing website, with a link to the original website so people can contact the seller.
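
As an illustration of the first case, here is a minimal sketch of how scraped listings could be summarized per make and model. The field names (make, model, price, mileage) and the sample rows are assumptions made up for this example, not the layout of any file offered on this page.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; real rows would be loaded from a scraped file.
listings = [
    {"make": "Toyota", "model": "Corolla", "price": 9000, "mileage": 80000},
    {"make": "Toyota", "model": "Corolla", "price": 10000, "mileage": 60000},
    {"make": "Honda", "model": "Civic", "price": 8500, "mileage": 95000},
]

def summarize(rows):
    """Group listings by (make, model) and compute count and averages."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["make"], row["model"])].append(row)

    summary = []
    for (make, model), items in groups.items():
        summary.append({
            "make": make,
            "model": model,
            "count": len(items),
            "avg_price": mean(item["price"] for item in items),
            "avg_mileage": mean(item["mileage"] for item in items),
        })

    # Most frequently listed models first.
    summary.sort(key=lambda entry: entry["count"], reverse=True)
    return summary

for entry in summarize(listings):
    print(entry)
```

Keep in mind that the counts reflect how many listings were live on the day of scraping, not how many cars were actually sold.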

Someone asked me to provide a database of prices for used car transactions. This is NOT possible because in most cases the buyer and seller meet somewhere on the street and negotiate the price face to face, so nobody knows what price they agreed on. There are also used car dealers, but they are unlikely to share their sales records with third parties.

Websites scraped so far

Here you can buy SOME of the records I scraped in the past for various customers. If you want an update (re-scraping) of any of the websites below, or another website of your choice, please ask!

Cardekho.com-SAMPLE.xls
Carlist.my-SAMPLE.xlsx
Carmag.co.za-SAMPLE.xlsx

Note for telemarketers

Note for telemarketers who want an email database or phone numbers to spam SMS ads: sorry, but I may not be able to provide emails or phone numbers. All major classifieds websites use various anti-scraping features on purpose, to prevent users who post cars for sale from being spammed with unsolicited calls and emails.

For example, the phone number is rendered as an image, or you need to click a button or solve a captcha to reveal the seller's phone number (certain anti-scraping features can be beaten, for example by using OCR software, but this is difficult and I would rather not bother doing this).

There may be some small websites that do not use any protection. Check for yourself by right-clicking the page > View source, then pressing Ctrl+F and trying to find the phone number. If it is there, then I can easily scrape it.
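
The same check can be roughly automated. The sketch below assumes you already know one phone number from viewing a listing in the browser, and tests whether its digits appear in the raw HTML, allowing for up to two separator characters (spaces, dashes, brackets) between digits. The function names are made up for this example, and the check will miss numbers that are loaded by JavaScript or shown as an image.

```python
import re
from urllib.request import Request, urlopen

def fetch_html(url):
    """Download the raw page source, as 'View source' would show it."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def phone_in_html(html, phone):
    """Return True if the digits of `phone` appear in `html` in order,
    with at most two non-digit characters between consecutive digits."""
    digits = re.sub(r"\D", "", phone)
    if not digits:
        return False
    pattern = r"\D{0,2}".join(digits)
    return re.search(pattern, html) is not None

# Offline usage with hypothetical page fragments:
plain_page = '<span class="tel">(555) 123-4567</span>'
image_page = '<img src="phone.png" alt="Click to reveal">'
print(phone_in_html(plain_page, "555 123 4567"))
print(phone_in_html(image_page, "555 123 4567"))
```

For a live page, pass the result of fetch_html(url) as the first argument instead of a hard-coded string.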
