Courses for Data Nerds
==================================
Google Data Analytics Certificate (START HERE)
SQL for Data Science
Excel Skills for Business
Python for Everybody
Data Visualization with Tableau
Data Science: Foundations using R
Coursera Plus Subscription (7-day free trial)
All courses
Build a Portfolio
==================================
Build portfolio here
Rebate Code: “LUKE”
My Portfolio
Books for Data Nerds
==================================
Books I've read
Data Analyst Must Read
Tableau
Power BI
Python
Tech for Data Nerds
==================================
Tech I use
Windows on a Mac (Parallels VM)
M1 MacBook Air (Mac of choice)
Dell XPS 13 (PC of choice)
Asus Vivo Book (Lowest Cost PC)
Lenovo IdeaPad (Best Value PC)
Social Media / Contact Me
======================
Business Inquiries: luke@lukebarousse.com
As a member of the Amazon, Coursera, Hostinger, and Parallels Affiliate Programs, I earn a commission from qualifying purchases on the links above. It costs you nothing but helps me with content creation.
Why didn't you just use proxies?
Luke, dude, bro.
Another video where the title question never gets answered. Brilliant.
IDK if the bot you programmed had some sort of rate limiting, like a delay of 1 second between each request!
But if you went through manually, it would be fine. But because you can do it quickly, it's banned.
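The delay between requests mentioned above could look something like this. It is only a minimal sketch: `fetch_politely` and its parameters are hypothetical names, the `fetch` callable stands in for whatever actually downloads a page, and the 1-second default is just the figure from the comment, not a value any site is known to accept.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call fetch(url) for each URL, pausing delay_seconds between requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

Passing `sleep` in as a parameter is only there so the pause can be swapped out when trying the function without real waiting.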
When you think you are the smartest.
So you were banned by applying the skills that those jobs require? Shouldn't you be hired?
You logged in. That's your mistake. You can't log in and scrape.
I don't understand why this is illegal or why anyone would even care. What's wrong with collecting data efficiently?
I think you are just scraping too fast, or collecting too much data from one IP, likely both. It's like how there is a limit of 70 connections per day on LinkedIn: you just need to stay within the amount of data they allow you.
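Staying within a per-day allowance, as the comment above suggests, can be sketched with a small counter. `DailyBudget` is a hypothetical helper, and the cap you pass in is an assumption you would have to look up for the site in question; the 70 quoted above is the commenter's figure, not a verified limit.

```python
import datetime
from typing import Callable


class DailyBudget:
    """Track how many requests were made today and refuse to exceed a cap."""

    def __init__(
        self,
        max_per_day: int,
        today: Callable[[], datetime.date] = datetime.date.today,
    ) -> None:
        self.max_per_day = max_per_day
        self._today = today
        self._day = today()
        self._used = 0

    def try_spend(self) -> bool:
        """Return True and count one request if budget remains, else False."""
        current = self._today()
        if current != self._day:
            # A new calendar day resets the counter.
            self._day = current
            self._used = 0
        if self._used >= self.max_per_day:
            return False
        self._used += 1
        return True
```

A scraper would call `try_spend()` before each request and stop for the day once it returns `False`.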
Proxies?
They have anti-scraping measures now too. I mean, the site is basically useless if you don't scrape it, because the search is literally dogwater, and I found scraping was the only way to filter the results and get actually relevant jobs.
So…just state it isn't illegal (in state law)
I think all these companies need to grow up and realize they are sending us paper catalogues with webpages. When we get the page, we can do whatever the fuck we want with it (privately).
"Publicly available"
What if we try to make a fast way to scrape data manually?
So they can collect our data anytime anywhere but we can't do the same?
Hi, Elon Musk just said data is being "aggressively scraped". How would Twitter be measuring that?
Go through a public dataset manually
LinkedIn: 🙂
Go through a public dataset with a bot
LinkedIn: 😡
You need to rotate IPs and user agents to reduce the chances of being caught and flagged as a bot.
My question is: how did you do this web scraping stuff? I mean, just show where to find a place to learn. I will dedicate 24 hours straight of my life to learning it. I will be very happy. As a data nerd, I'm going crazy with all of the flashy stuff on the internet that has no value. Help me.
A few years ago I scraped data that was in the public domain, from websites around the world. I never had a problem with accessing the web pages. The problem was that the webpages changed. You had to constantly rewrite the scraping code, or change inputs to scraping tools. It might have cost less and reduced a lot of stress just to hire low-cost labor to manually input the data.
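One way to soften the "webpages changed" problem described above is to keep the page-specific patterns in a list and try them in order, so a layout change means adding an entry rather than rewriting the scraper. This is a rough sketch only: the HTML snippets and `TITLE_PATTERNS` are made up for illustration, and regular expressions on HTML are fragile compared with a real parser.

```python
import re
from typing import Iterable, Optional

# Hypothetical patterns for one field across two assumed page layouts.
TITLE_PATTERNS = [
    r'<h1 class="job-title">(.*?)</h1>',  # assumed newer layout
    r'<span id="title">(.*?)</span>',     # assumed older layout
]


def extract_first(html: str, patterns: Iterable[str]) -> Optional[str]:
    """Return the first captured group from the first pattern that matches."""
    for pattern in patterns:
        match = re.search(pattern, html, flags=re.DOTALL)
        if match:
            return match.group(1).strip()
    # No known layout matched; the caller can log this and add a new pattern.
    return None
```

Returning `None` instead of raising makes it easy to notice when a site has changed again without stopping the whole run.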
I don't get it … It's no different from a real person sitting there and copy-pasting things all day. Or they want you to do it manually so you can suffer …
I already knew that; that's why I never tried with LinkedIn.
There are GitHub projects for that as well, but they don't come with a warranty.
Data viewed by the public on the internet via a privately owned corporate site does not necessarily equal public data.
Just use a proxy.
Because of that ToS, I'm now scraping data manually for my client, and it's a pain in the arse. Lmao
Step one…build a web scraper that mimics human browsing behavior.
Or use APIs and get your data that way.
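Getting data through an API instead of scraping usually comes down to an authenticated request and some JSON parsing. The sketch below uses only the standard library; the endpoint `API_URL`, the bearer-token scheme, and the `jobs`/`title`/`company` field names are all assumptions for illustration, not any real service's API.

```python
import json
import urllib.request
from typing import Dict, List

# Hypothetical endpoint; replace with the documented URL of a real API.
API_URL = "https://api.example.com/v1/jobs?query=data+analyst"


def build_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying an (assumed) bearer token."""
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def parse_jobs(body: str) -> List[Dict[str, str]]:
    """Keep only the title and company of each job in the assumed payload."""
    payload = json.loads(body)
    return [
        {"title": job["title"], "company": job["company"]}
        for job in payload.get("jobs", [])
    ]


def fetch_jobs(url: str, token: str) -> List[Dict[str, str]]:
    """Send the request and parse the response (performs a network call)."""
    with urllib.request.urlopen(build_request(url, token), timeout=10) as response:
        return parse_jobs(response.read().decode("utf-8"))
```

Keeping `parse_jobs` separate from the network call means the parsing can be checked against a saved response without hitting the API at all.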
This was actually a project idea I had for quite some time: to see job distribution across different states/countries, cross-reference it with salary by company from Glassdoor, and all that. While researching, I discovered that there is an informal LinkedIn API, so you don't actually need to scrape all the data. Quite helpful.
There are a bunch of articles on Medium about it too
I'm working with a weather API, but now it gives me an error like "you have been blocked because we have registered an unusual amount of traffic from your IP address."
So I can't finish my project because of this. How can I solve this issue?
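A common response to an "unusual amount of traffic" block is to slow down and retry with increasing waits, honoring the server's `Retry-After` header when it sends one. The helper below is a minimal sketch of that wait calculation; `backoff_delay`, its defaults, and the choice to handle only the numeric form of `Retry-After` are assumptions, and whether a given weather API lifts a block this way depends on its own documented limits.

```python
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None and retry_after.strip().isdigit():
        # The server said how long to wait, in seconds; use that value as-is.
        return float(retry_after.strip())
    # Otherwise double the wait on each attempt, up to the cap.
    return min(base * (2 ** attempt), cap)
```

A retry loop would sleep for `backoff_delay(attempt, response_header)` after each blocked response and give up after a fixed number of attempts. Caching responses locally so the same forecast is not requested twice would reduce traffic further.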
Scrape so fast, the backend crashes