lxml.cssselect module
CSS Selectors based on XPath.
This module supports selecting XML/HTML tags based on CSS selectors. See the CSSSelector class for details.
This is a thin wrapper around cssselect 0.7 or later.
- class lxml.cssselect.CSSSelector(css, namespaces=None, translator='xml')
Bases: XPath
A CSS selector.
Usage:
>>> from lxml import etree, cssselect
>>> select = cssselect.CSSSelector("a tag > child")
>>> root = etree.XML("<a><b><c/><tag><child>TEXT</child></tag></b></a>")
>>> [ el.tag for el in select(root) ]
['child']
To use CSS namespaces, pass a prefix-to-namespace mapping as the namespaces keyword argument:
>>> rdfns = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
>>> select_ns = cssselect.CSSSelector('root > rdf|Description',
...                                   namespaces={'rdf': rdfns})
>>> rdf = etree.XML((
...     '<root xmlns:rdf="%s">'
...     '<rdf:Description>blah</rdf:Description>'
...     '</root>') % rdfns)
>>> [(el.tag, el.text) for el in select_ns(rdf)]
[('{http://www.w3.org/1999/02/22-rdf-syntax-ns#}Description', 'blah')]
- evaluate(self, _eval_arg, **_variables)
Evaluate an XPath expression.
Instead of calling this method, you can also call the evaluator object itself.
Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.
- Deprecated: call the object, not its method.
- error_log
- path
The literal XPath expression.
- class lxml.cssselect.LxmlHTMLTranslator(xhtml: bool = False)
Bases: LxmlTranslator, HTMLTranslator
lxml extensions + HTML support.
- xpathexpr_cls
alias of XPathExpr
- css_to_xpath(css: str, prefix: str = 'descendant-or-self::') str
Translate a group of selectors to XPath.
Pseudo-elements are not supported here since XPath only knows about “real” elements.
- Parameters:
css – A group of selectors as a string.
prefix – This string is prepended to the XPath expression for each selector. The default makes selectors scoped to the context node’s subtree.
- Raises:
SelectorSyntaxError on invalid selectors, ExpressionError on unknown/unsupported selectors, including pseudo-elements.
- Returns:
The equivalent XPath 1.0 expression as a string.
- pseudo_never_matches(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- selector_to_xpath(selector: Selector, prefix: str = 'descendant-or-self::', translate_pseudo_elements: bool = False) str
Translate a parsed selector to XPath.
- Parameters:
selector – A parsed Selector object.
prefix – This string is prepended to the resulting XPath expression. The default makes selectors scoped to the context node’s subtree.
translate_pseudo_elements – Unless this is set to True (as css_to_xpath() does), the pseudo_element attribute of the selector is ignored. It is the caller’s responsibility to reject selectors with pseudo-elements, or to account for them somehow.
- Raises:
ExpressionError on unknown/unsupported selectors.
- Returns:
The equivalent XPath 1.0 expression as a string.
- xpath(parsed_selector: Element | Hash | Class | Function | Pseudo | Attrib | Negation | Relation | Matching | SpecificityAdjustment | CombinedSelector) XPathExpr
Translate any parsed selector object.
- xpath_active_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_attrib(selector: Attrib) XPathExpr
Translate an attribute selector.
- xpath_attrib_dashmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_different(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_equals(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_exists(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_includes(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_prefixmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_substringmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_suffixmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_checked_pseudo(xpath: XPathExpr) XPathExpr
Matches checked checkboxes and radio buttons and selected option elements. HTMLTranslator overrides the generic never-matching implementation here.
- xpath_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is an immediate child of left
- xpath_class(class_selector: Class) XPathExpr
Translate a class selector.
- xpath_combinedselector(combined: CombinedSelector) XPathExpr
Translate a combined selector.
- xpath_contains_function(xpath, function)
- xpath_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a child, grand-child or further descendant of left
- xpath_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling immediately after left
- xpath_disabled_pseudo(xpath: XPathExpr) XPathExpr
Matches disabled form controls. HTMLTranslator overrides the generic never-matching implementation here.
- xpath_element(selector: Element) XPathExpr
Translate a type or universal selector.
- xpath_empty_pseudo(xpath: XPathExpr) XPathExpr
- xpath_enabled_pseudo(xpath: XPathExpr) XPathExpr
Matches enabled form controls and links. HTMLTranslator overrides the generic never-matching implementation here.
- xpath_first_child_pseudo(xpath: XPathExpr) XPathExpr
- xpath_first_of_type_pseudo(xpath: XPathExpr) XPathExpr
- xpath_focus_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_function(function: Function) XPathExpr
Translate a functional pseudo-class.
- xpath_hash(id_selector: Hash) XPathExpr
Translate an ID selector.
- xpath_hover_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling after left, immediately or not
- xpath_lang_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_last_child_pseudo(xpath: XPathExpr) XPathExpr
- xpath_last_of_type_pseudo(xpath: XPathExpr) XPathExpr
- xpath_link_pseudo(xpath: XPathExpr) XPathExpr
Matches a, link and area elements that have an href attribute. HTMLTranslator overrides the generic never-matching implementation here.
- static xpath_literal(s: str) str
- xpath_matching(matching: Matching) XPathExpr
- xpath_negation(negation: Negation) XPathExpr
- xpath_nth_child_function(xpath: XPathExpr, function: Function, last: bool = False, add_name_test: bool = True) XPathExpr
- xpath_nth_last_child_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_nth_last_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_nth_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_only_child_pseudo(xpath: XPathExpr) XPathExpr
- xpath_only_of_type_pseudo(xpath: XPathExpr) XPathExpr
- xpath_pseudo(pseudo: Pseudo) XPathExpr
Translate a pseudo-class.
- xpath_pseudo_element(xpath: XPathExpr, pseudo_element: FunctionalPseudoElement | str) XPathExpr
Translate a pseudo-element.
Defaults to not supporting pseudo-elements at all, but can be overridden by sub-classes.
- xpath_relation(relation: Relation) XPathExpr
- xpath_relation_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is an immediate child of left; select left
- xpath_relation_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a child, grand-child or further descendant of left; select left
- xpath_relation_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling immediately after left; select left
- xpath_relation_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling after left, immediately or not; select left
- xpath_root_pseudo(xpath: XPathExpr) XPathExpr
- xpath_scope_pseudo(xpath: XPathExpr) XPathExpr
- xpath_specificityadjustment(matching: SpecificityAdjustment) XPathExpr
- xpath_target_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_visited_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- attribute_operator_mapping = {'!=': 'different', '$=': 'suffixmatch', '*=': 'substringmatch', '=': 'equals', '^=': 'prefixmatch', 'exists': 'exists', '|=': 'dashmatch', '~=': 'includes'}
- combinator_mapping = {' ': 'descendant', '+': 'direct_adjacent', '>': 'child', '~': 'indirect_adjacent'}
- id_attribute = 'id'
The attribute used for ID selectors depends on the document language: http://www.w3.org/TR/selectors/#id-selectors
- lang_attribute = 'lang'
The attribute used for :lang() depends on the document language: http://www.w3.org/TR/selectors/#lang-pseudo
- lower_case_attribute_names = False
- lower_case_attribute_values = False
- lower_case_element_names = False
The case sensitivity of document language element names, attribute names, and attribute values in selectors depends on the document language. http://www.w3.org/TR/selectors/#casesens
When a document language defines one of these as case-insensitive, cssselect assumes that the document parser makes the parsed values lower-case. Making the selector lower-case too makes the comparison case-insensitive.
In HTML, element names and attribute names (but not attribute values) are case-insensitive. All of lxml.html, html5lib, BeautifulSoup4 and HTMLParser make them lower-case in their parse result, so the assumption holds.
- class lxml.cssselect.LxmlTranslator
Bases: GenericTranslator
A custom CSS selector to XPath translator with lxml-specific extensions.
- xpathexpr_cls
alias of XPathExpr
- css_to_xpath(css: str, prefix: str = 'descendant-or-self::') str
Translate a group of selectors to XPath.
Pseudo-elements are not supported here since XPath only knows about “real” elements.
- Parameters:
css – A group of selectors as a string.
prefix – This string is prepended to the XPath expression for each selector. The default makes selectors scoped to the context node’s subtree.
- Raises:
SelectorSyntaxError on invalid selectors, ExpressionError on unknown/unsupported selectors, including pseudo-elements.
- Returns:
The equivalent XPath 1.0 expression as a string.
- pseudo_never_matches(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- selector_to_xpath(selector: Selector, prefix: str = 'descendant-or-self::', translate_pseudo_elements: bool = False) str
Translate a parsed selector to XPath.
- Parameters:
selector – A parsed Selector object.
prefix – This string is prepended to the resulting XPath expression. The default makes selectors scoped to the context node’s subtree.
translate_pseudo_elements – Unless this is set to True (as css_to_xpath() does), the pseudo_element attribute of the selector is ignored. It is the caller’s responsibility to reject selectors with pseudo-elements, or to account for them somehow.
- Raises:
ExpressionError on unknown/unsupported selectors.
- Returns:
The equivalent XPath 1.0 expression as a string.
- xpath(parsed_selector: Element | Hash | Class | Function | Pseudo | Attrib | Negation | Relation | Matching | SpecificityAdjustment | CombinedSelector) XPathExpr
Translate any parsed selector object.
- xpath_active_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_attrib(selector: Attrib) XPathExpr
Translate an attribute selector.
- xpath_attrib_dashmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_different(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_equals(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_exists(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_includes(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_prefixmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_substringmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_attrib_suffixmatch(xpath: XPathExpr, name: str, value: str | None) XPathExpr
- xpath_checked_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is an immediate child of left
- xpath_class(class_selector: Class) XPathExpr
Translate a class selector.
- xpath_combinedselector(combined: CombinedSelector) XPathExpr
Translate a combined selector.
- xpath_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a child, grand-child or further descendant of left
- xpath_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling immediately after left
- xpath_disabled_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_element(selector: Element) XPathExpr
Translate a type or universal selector.
- xpath_empty_pseudo(xpath: XPathExpr) XPathExpr
- xpath_enabled_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_first_child_pseudo(xpath: XPathExpr) XPathExpr
- xpath_first_of_type_pseudo(xpath: XPathExpr) XPathExpr
- xpath_focus_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_function(function: Function) XPathExpr
Translate a functional pseudo-class.
- xpath_hash(id_selector: Hash) XPathExpr
Translate an ID selector.
- xpath_hover_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling after left, immediately or not
- xpath_lang_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_last_child_pseudo(xpath: XPathExpr) XPathExpr
- xpath_last_of_type_pseudo(xpath: XPathExpr) XPathExpr
- xpath_link_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- static xpath_literal(s: str) str
- xpath_matching(matching: Matching) XPathExpr
- xpath_negation(negation: Negation) XPathExpr
- xpath_nth_child_function(xpath: XPathExpr, function: Function, last: bool = False, add_name_test: bool = True) XPathExpr
- xpath_nth_last_child_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_nth_last_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_nth_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
- xpath_only_child_pseudo(xpath: XPathExpr) XPathExpr
- xpath_only_of_type_pseudo(xpath: XPathExpr) XPathExpr
- xpath_pseudo(pseudo: Pseudo) XPathExpr
Translate a pseudo-class.
- xpath_pseudo_element(xpath: XPathExpr, pseudo_element: FunctionalPseudoElement | str) XPathExpr
Translate a pseudo-element.
Defaults to not supporting pseudo-elements at all, but can be overridden by sub-classes.
- xpath_relation(relation: Relation) XPathExpr
- xpath_relation_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is an immediate child of left; select left
- xpath_relation_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a child, grand-child or further descendant of left; select left
- xpath_relation_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling immediately after left; select left
- xpath_relation_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr
right is a sibling after left, immediately or not; select left
- xpath_root_pseudo(xpath: XPathExpr) XPathExpr
- xpath_scope_pseudo(xpath: XPathExpr) XPathExpr
- xpath_specificityadjustment(matching: SpecificityAdjustment) XPathExpr
- xpath_target_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- xpath_visited_pseudo(xpath: XPathExpr) XPathExpr
Common implementation for pseudo-classes that never match.
- attribute_operator_mapping = {'!=': 'different', '$=': 'suffixmatch', '*=': 'substringmatch', '=': 'equals', '^=': 'prefixmatch', 'exists': 'exists', '|=': 'dashmatch', '~=': 'includes'}
- combinator_mapping = {' ': 'descendant', '+': 'direct_adjacent', '>': 'child', '~': 'indirect_adjacent'}
- id_attribute = 'id'
The attribute used for ID selectors depends on the document language: http://www.w3.org/TR/selectors/#id-selectors
- lang_attribute = 'xml:lang'
The attribute used for :lang() depends on the document language: http://www.w3.org/TR/selectors/#lang-pseudo
- lower_case_attribute_names = False
- lower_case_attribute_values = False
- lower_case_element_names = False
The case sensitivity of document language element names, attribute names, and attribute values in selectors depends on the document language. http://www.w3.org/TR/selectors/#casesens
When a document language defines one of these as case-insensitive, cssselect assumes that the document parser makes the parsed values lower-case. Making the selector lower-case too makes the comparison case-insensitive.
In HTML, element names and attribute names (but not attribute values) are case-insensitive. All of lxml.html, html5lib, BeautifulSoup4 and HTMLParser make them lower-case in their parse result, so the assumption holds.