What are CSS attribute selectors and what's the syntax?
CSS attribute selectors match elements by an attribute’s presence or value, written inside square brackets. There are seven forms, plus an i flag for case-insensitive matching. The same syntax works in BeautifulSoup, Playwright, Puppeteer, and other scraping tools that accept CSS selectors.
[attr] /* element has the attribute, any value */
[attr="value"] /* exact match */
[attr~="value"] /* value is one whitespace-separated word in a list */
[attr|="value"] /* value is "value" or starts with "value-" (e.g. lang|="en" matches en-US) */
[attr^="value"] /* value starts with "value" */
[attr$="value"] /* value ends with "value" */
[attr*="value"] /* value contains "value" anywhere */
[attr="value" i] /* case-insensitive match (i flag before the closing bracket) */
Practical use
Form inputs and links are the most common targets. Selecting by type or by attribute presence avoids extra state classes:
input[type="submit"] { background: #0070f3; color: #fff; }
input[required] { outline: 2px solid red; }
input[disabled] { opacity: 0.4; }
a[href^="http"] { /* absolute URLs, usually external links */ }
a[href$=".pdf"] { color: firebrick; }
[data-status="active"] { border-left: 3px solid green; }
Chained selectors act as AND conditions. input[type="radio"][checked] matches only radios that carry the checked attribute in the markup; the :checked pseudo-class tracks the live state instead. For scraping, the same chaining picks out elements that have no class hook, for example a[href*="/product/"][data-id] to grab product links that also carry a data-id attribute.
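To make the matching rules and the AND behaviour of chaining concrete, here is a minimal sketch in Python that uses only the standard library. It is not how any browser or scraping library implements selectors. The names attr_matches, matches_all, and LinkCollector are hypothetical helpers written for this illustration, the sample markup is made up, and the i flag is approximated with str.lower().

```python
from html.parser import HTMLParser


def attr_matches(attrs, name, op=None, value=None, ignore_case=False):
    """Check one [name op "value"] condition against a dict of attributes."""
    if name not in attrs:
        return False              # every form requires the attribute to exist
    if op is None:
        return True               # [attr]: presence is enough
    actual = attrs[name] or ""    # valueless attributes are treated as ""
    if ignore_case:               # rough stand-in for the i flag
        actual, value = actual.lower(), value.lower()
    if op == "=":
        return actual == value
    if op == "~=":
        return value in actual.split()
    if op == "|=":
        return actual == value or actual.startswith(value + "-")
    # The substring operators never match an empty value.
    if op == "^=":
        return bool(value) and actual.startswith(value)
    if op == "$=":
        return bool(value) and actual.endswith(value)
    if op == "*=":
        return bool(value) and value in actual
    raise ValueError(f"unknown operator: {op}")


def matches_all(attrs, conditions):
    """Chained attribute selectors are AND conditions: all must hold."""
    return all(attr_matches(attrs, *cond) for cond in conditions)


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that satisfy every condition."""

    def __init__(self, conditions):
        super().__init__()
        self.conditions = conditions
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and matches_all(attrs, self.conditions):
            self.hrefs.append(attrs.get("href"))


# Made-up sample markup for illustration only.
SAMPLE_HTML = """
<a href="/product/42" data-id="42">Widget</a>
<a href="/product/43">Gadget (no data-id)</a>
<a href="/about" data-id="7">About</a>
"""

# Conditions equivalent to a[href*="/product/"][data-id]
collector = LinkCollector([("href", "*=", "/product/"), ("data-id",)])
collector.feed(SAMPLE_HTML)
print(collector.hrefs)  # ['/product/42']
```

Only the first link passes both conditions: the second has no data-id attribute and the third has no /product/ in its href. In a tool that accepts CSS selectors, the selector string would normally be passed as-is, for example to BeautifulSoup's select(), rather than reimplemented like this.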