launchpad-reviewers team mailing list archive
-
launchpad-reviewers team
-
Mailing list archive
-
Message #04852
[Merge] lp:~adeuring/launchpad/hwdb-test-natty-mess into lp:launchpad
Abel Deuring has proposed merging lp:~adeuring/launchpad/hwdb-test-natty-mess into lp:launchpad.
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
For more details, see:
https://code.launchpad.net/~adeuring/launchpad/hwdb-test-natty-mess/+merge/73722
This branch fixes the problem that HWDB submissions from Natty can not be processed.
The reason for the failure is simple: The submission processing script expects an XML file with a structure as defined in lib/canonical/launchpad/scripts/hardware-1_0.rng
One part of the required XML data looks like
<hardware>
<udev>(lots of text data)
</udev>
<dmi>(more text data)
</dmi>
(other tags)
</hardware>
The reports from Natty do not contain the tags <udev> and <dmi>; their content is instead stored in
<context>
<info command="udevadm info --export-db">(text data)
</info>
<info command="grep -r . /sys/class/dmi/id/ 2>/dev/null">(test data)
</info>
</context>
The fix is quite simple: Two regular expressions check if the tags <udev> and <dmi> exist within <hardware>; if not, these tags are inserted, provided that the corresponding <info> nodes can be found.
This is a bit of a "brute force" fix. I tried at first to modify the etree structure, but then flacoste gave me the hint that the RelaxNG validation is _not_ on the etree representaion of the data, but on the "text string". Creating the etree, modifying it, writing its contents into a StringIO buffer, and finally passing the buffer's content to the RNG validator should have worked too, but would be much slower.
A note about the diff: The first 480 lines are the content of a new file with test data. Its content is a variant of the already existing file lib/canonical/launchpad/scripts/tests/hardwaretest-udev.xml and hence probably easier to understand by running
diff -u hardwaretest-udev.xml hardwaretest-natty.xml
in lib/canonical/launchpad/scripts/tests
test: ./bin/test hardwaredb -vvt test_hwdb_submission_validation
--
https://code.launchpad.net/~adeuring/launchpad/hwdb-test-natty-mess/+merge/73722
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~adeuring/launchpad/hwdb-test-natty-mess into lp:launchpad.
=== added file 'lib/canonical/launchpad/scripts/tests/hardwaretest-natty.xml'
--- lib/canonical/launchpad/scripts/tests/hardwaretest-natty.xml 1970-01-01 00:00:00 +0000
+++ lib/canonical/launchpad/scripts/tests/hardwaretest-natty.xml 2011-09-01 20:51:27 +0000
@@ -0,0 +1,473 @@
+<?xml version="1.0" ?>
+<system version="1.0">
+
+ <!-- Reports coming from the checkbox version in Natty do not
+ have the tags <udev> and <dmi> inside <hardware>. Instead,
+ they store the data expected in these tags in the tags
+
+ <info command="udevadm info - -export-db"> and
+ <info command="grep -r . /sys/class/dmi/id/ 2>/dev/null">
+
+ inside <
+ -->
+ <!-- summary: generic information about the submission -->
+ <summary>
+
+ <!-- live_cd: Was this submission made on a system running an Ubuntu Live
+ CD or on a regular Ubuntu/Linux installation?
+ -->
+ <live_cd value="False"/>
+
+ <!-- system_id: A hash of the "system identifier". This value is intended
+ to identify the tested computer model; the value should
+ be derived from the properties
+ system.product, system.vendor of the HAL UDI
+ /org/freedesktop/Hal/devices/computer.
+ -->
+ <system_id value="f982bb1ab536469cebfd6eaadcea0ffc"/>
+
+ <!-- distribution, distroseries: These values are retrieved from
+ /etc/lsb-release, parameters DISTRIB_ID and DISTRIB_RELEASE.
+ -->
+ <distribution value="Ubuntu"/>
+ <distroseries value="7.04"/>
+
+ <!-- architecture: The processor architecture of the operating system.
+ -->
+ <architecture value="amd64"/>
+
+ <!-- private: If False, this submission is publicly accessible from
+ Launchpad, else it is only accesible by the submitter, by
+ Launchpad administrators and by scripts running with
+ administrator rights. Submissions marked "private" should
+ only be used to gather statistical data.
+ -->
+ <private value="False"/>
+
+ <!-- contactable: If True, the owner agrees to be contacted by other
+ persons about devices which appear in his submission.
+ Example of a use case: Developers can ask device owners
+ to perform tests.
+ -->
+ <contactable value="False"/>
+
+ <!-- date_created: Date and time (UTC) of the submission.
+ -->
+ <date_created value="2007-09-28T16:09:20.126842"/>
+
+ <!-- client: The name and version of the program that created the
+ submission data.
+ -->
+ <client name="hwtest" version="0.9">
+
+ <!-- plugin: name and version of a plugin used by the client.
+ This tag may appear more than once.
+ -->
+ <plugin name="architecture_info" version="1.1"/>
+ <plugin name="find_network_controllers" version="2.34"/>
+ <plugin name="internet_ping" version="1.1"/>
+ <plugin name="harddisk_speed" version="0.7"/>
+ </client>
+
+ <!-- The kernel name and version, as shown by "uname -r"
+ -->
+ <kernel-release value="2.6.28-14-generic"/>
+ </summary>
+
+ <!-- hardware: data about the hardware the submission was made on.
+ -->
+ <hardware>
+
+ <!-- udev: The output of running "udevadm info - -export-db" -->
+
+
+ <!-- Additional data for SCSI devices: vendor, model, type
+
+ For each udev node which has DEVTYPE=scsi_device, we need
+ the content of the sysfs files vendor, model, type. The data
+ is stored in the same format as the DMI data:
+ /path/to/file:filecontent
+ -->
+ <sysfs-attributes>
+P: /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
+A: modalias=input:b0019v0000p0001e0000-e0,1,k74,ramlsfw
+A: uniq=
+A: phys=LNXPWRBN/button/input0
+A: name=Power Button
+
+P: /devices/LNXSYSTM:00/device:00/PNP0A08:00/device:03/input/input8
+A: modalias=input:b0019v0000p0006e0000-e0,1,kE0,E1,E3,F0,F1,F2,F3,F4,F5,ramlsfw
+A: uniq=
+A: phys=/video/input0
+A: name=Video Bus
+</sysfs-attributes>
+
+ <!-- processors: Data about processors installed in a system.
+ The data is retrieved from /proc/cpuinfo.
+ -->
+ <processors>
+
+ <!-- processor: Data from /proc/cpuinfo about a single processor.
+ -->
+ <processor id="123" name="0">
+
+ <!-- property: The data of one line of /proc/cpuinfo.
+ attribute name: The name of the property
+ (the text left of the ':' in a line of /proc/cpuinfo)
+ attribute type: A Python type appropriate for the value.
+ -->
+ <property name="wp" type="bool">
+ True
+ </property>
+ <property name="flags" type="list">
+ <value type="str">
+ fpu
+ </value>
+ <value type="str">
+ vme
+ </value>
+ <value type="str">
+ de
+ </value>
+ </property>
+ <property name="cpu_mhz" type="float">
+ 1000.0
+ </property>
+ </processor>
+ </processors>
+
+ <!-- aliases: optional data provided by the user:
+ The name of a peripheral, PCI card etc as shown by a label on
+ the device. OEM devices are often sold under different names
+ by different vendors; having a set of alias names for a device
+ allows users of the HWDB to search for information by these
+ "marketing names".
+ -->
+ <aliases>
+ <!-- alias: The "label name" of a device or system.
+ attribute target: The sysfs path of a device as given
+ in <udev>.
+ -->
+ <alias target="/devices/pci0000:00/0000:00:1f.1/host3/target3:0:0/3:0:0:0">
+
+ <!-- vendor: The vendor name shown on the device label.
+ -->
+ <vendor>Medion</vendor>
+
+ <!-- model: The model name of shown on the label.
+ -->
+ <model>QuickPrint 9876</model>
+ </alias>
+ </aliases>
+ </hardware>
+
+ <!-- software: Data about the software installed on the system.
+ -->
+ <software>
+
+ <!-- lsbrelease: The data from /etc/lsb-release.
+ -->
+ <lsbrelease>
+
+ <!-- property: the data from one line of /etc/lsb-release.
+ attribute type: A Python type appropriate for this
+ property (str).
+ -->
+ <property name="release" type="str">
+ 7.04
+ </property>
+ <property name="codename" type="str">
+ feisty
+ </property>
+ <property name="distributor-id" type="str">
+ Ubuntu
+ </property>
+ <property name="description" type="str">
+ Ubuntu 7.04
+ </property>
+ <property name="dict_example" type="dict">
+ <value name="a" type="str">value for key a</value>
+ <value name="b" type="int">1234</value>
+ </property>
+ </lsbrelease>
+
+ <!-- packages: Data about the installed software packages.
+ -->
+ <packages>
+
+ <!-- package: Data about a single package.
+ The <property> sub-tags contain the DEB properties
+ "name", "priority", "section", "source", "version",
+ "installed_size", "size", "summary".
+
+ XXX Abel Deuring 2007-12-12: What about submissions
+ from RPM-based Linux versions? (And "exotic" variants
+ like Gentoo?)
+ -->
+ <package name="metacity" id="200">
+ <property name="installed_size" type="int">
+ 868352
+ </property>
+ <property name="section" type="str">
+ x11
+ </property>
+ <property name="summary" type="str">
+ A lightweight GTK2 based Window Manager
+ </property>
+ <property name="priority" type="str">
+ optional
+ </property>
+ <property name="source" type="str">
+ metacity
+ </property>
+ <property name="version" type="str">
+ 1:2.18.2-0ubuntu1.1
+ </property>
+ <property name="size" type="int">
+ 429128
+ </property>
+ </package>
+ </packages>
+ <!-- Information extracted from Xorg.0.log.
+ HAL does not provide any information about Xorg drivers, so
+ we retrieve that from the xserver's log file.
+ -->
+ <xorg version="1.3.0">
+ <!-- driver: Data about a driver.
+ (optional)
+ attribute name: The name of the driver.
+ attribute version: The version of the driver.
+ attribute class: The module class of the driver
+ attribute device: The ID of a device driven by this driver.
+ -->
+ <driver name="fglrx" version="1.23" class="X.Org Video Driver"
+ device="12"/>
+ </xorg>
+ </software>
+
+ <!-- questions: User's answers to questions asked by the client.
+ -->
+ <questions>
+
+ <!-- question: Data of a question.
+ attribute name: The unique name of the question.
+ attribute plugin: The name of the plugin which asked
+ the question.
+ attribute version: The version of the question.
+ attribute type: Allowed values are "manual" and "automatic".
+ A "manual" question requires user input for the answer;
+ an "automatic" question gets the answer automatically.
+ -->
+ <question name="detected_network_controllers"
+ plugin="find_network_controllers">
+
+ <!-- target: Information about a device or software package the
+ question is about. The attribute "id" is the sysfs path of
+ a device as given in <udev>, or the ID of a software package
+ node.
+ This node may appear multiple times.
+ -->
+ <target id="/devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0/usb_endpoint/usbdev3.1_ep81">
+ <!-- driver: The driver which controls the target device. This tag
+ may appear more than once.
+
+ While we are working on a project called "Hardware Database",
+ we are not that much interested in the question, if a device
+ works "as such", but if their Linux driver(s) work.
+
+ It is not in every case possible to identify the used driver
+ from HAL data, so we need another way to add this information.
+ (example: HAL does not know, which driver is used for the
+ graphics card.)
+
+ Example for multiple drivers: Some scanners have a SCSI _and_
+ a USB interface; if such a scanner is tested, we not only want
+ to know, which Sane backend is used, but also, which interface
+ is used.
+
+ Also, it might be interesting to know for many USB 2.0 devices,
+ if a USB 1 (uhci_hcd or ohci_hcd driver) or the USB 2 driver
+ (ehci_hcd) was used. A USB 1 driver might for example explain
+ latency problems.
+ -->
+ <driver>ipw3945</driver>
+ </target>
+
+ <!-- ID of the 88E8055 PCI-E Gigabit Ethernet Controller -->
+ <target id="/devices/pci0000:00/0000:00:1f.1"/>
+
+ <!-- command: The command line of an external command required to
+ ask this question.
+ -->
+ <command/>
+
+ <!-- answer: The answer to the question. Two types of answers are
+ defined, "multiple_choice" and "measurement". (See below
+ for an example of the latter.)
+ attribute type: Must be "multiple_choice" or "measurement".
+ -->
+ <answer type="multiple_choice">pass</answer>
+
+ <!-- answer_choices: The list of possible choices.
+ The data should only be used for consistency
+ checks and to detect variants of the question.
+ -->
+ <answer_choices>
+ <value type="str">fail</value>
+ <value type="str">pass</value>
+ <value type="str">skip</value>
+ </answer_choices>
+
+ <!-- A user comment about the device or about the test.
+ -->
+ <comment>
+ The WLAN adapter drops the connection very frequently.
+ </comment>
+ </question>
+
+ <question name="internet_ping"
+ plugin="internet_ping">
+ <target id="/devices/pci0000:00/0000:00:1f.1"/>
+ <command/>
+ <answer type="multiple_choice">pass</answer>
+ <answer_choices>
+ <value type="str">fail</value>
+ <value type="str">pass</value>
+ <value type="str">skip</value>
+ </answer_choices>
+ </question>
+
+ <!-- example for a "measurement question"
+ -->
+ <question name="harddisk_speed"
+ plugin="harddisk_speed">
+ <target id="/devices/pci0000:00/0000:00:1f.1/host3/target3:0:0/3:0:0:0"/>
+ <command>hdparm -t /dev/sda</command>
+ <!-- answer: The answer to a "measurement question".
+ attribute type: See above.
+ attribute unit: The unit of the result of the measurement.
+ XXX Abel Deuring 2007-12-12 bug=175978 We should
+ enumerate a list of allowed units, in order to avoid
+ multiple units for the same dimension. e.g., B/sec,
+ MB/sec or inch, cm, foot.
+
+ For dimensionless values, the attribute unit is omitted.
+
+ "Percentage" and similar "convenience pseudo-units" like
+ ppm are _not_ allowed; instead a dimensionless
+ value must be used, where 0 is equivalent 0% and 1.0 is
+ equivalent to 100%.
+ -->
+ <answer type="measurement" unit="MB/sec">38.4</answer>
+ </question>
+ </questions>
+ <!-- miscellaneous additional text data.
+ -->
+ <context>
+ <info command="udevadm info --export-db">P: /devices/LNXSYSTM:00
+E: UDEV_LOG=3
+E: DEVPATH=/devices/LNXSYSTM:00
+E: MODALIAS=acpi:LNXSYSTM:
+
+P: /devices/pci0000:00/0000:00:1a.0
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1a.0
+E: DRIVER=uhci_hcd
+E: PCI_CLASS=C0300
+E: PCI_ID=8086:2834
+E: PCI_SUBSYS_ID=17AA:20AA
+E: PCI_SLOT_NAME=0000:00:1a.0
+E: MODALIAS=pci:v00008086d00002834sv000017AAsd000020AAbc0Csc03i00
+
+P: /devices/pci0000:00/0000:00:1a.0/usb3
+N: bus/usb/003/001
+S: char/189:256
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1a.0/usb3
+E: MAJOR=189
+E: MINOR=256
+E: DEVTYPE=usb_device
+E: DRIVER=usb
+E: DEVICE=/proc/bus/usb/003/001
+E: PRODUCT=1d6b/1/206
+E: TYPE=9/0/0
+E: BUSNUM=003
+E: DEVNUM=001
+E: DEVNAME=/dev/bus/usb/003/001
+E: DEVLINKS=/dev/char/189:256
+
+P: /devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0
+E: DEVTYPE=usb_interface
+E: DRIVER=hub
+E: DEVICE=/proc/bus/usb/003/001
+E: PRODUCT=1d6b/1/206
+E: TYPE=9/0/0
+E: INTERFACE=9/0/0
+E: MODALIAS=usb:v1D6Bp0001d0206dc09dsc00dp00ic09isc00ip00
+
+P: /devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0/usb_endpoint/usbdev3.1_ep81
+N: usbdev3.1_ep81
+S: char/252:4
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0/usb_endpoint/usbdev3.1_ep81
+E: MAJOR=252
+E: MINOR=4
+E: DEVNAME=/dev/usbdev3.1_ep81
+E: DEVLINKS=/dev/char/252:4
+
+P: /devices/pci0000:00/0000:00:1f.1
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1f.1
+E: DRIVER=ata_piix
+E: PCI_CLASS=1018A
+E: PCI_ID=8086:2850
+E: PCI_SUBSYS_ID=17AA:20A6
+E: PCI_SLOT_NAME=0000:00:1f.1
+E: MODALIAS=pci:v00008086d00002850sv000017AAsd000020A6bc01sc01i8a
+
+P: /devices/pci0000:00/0000:00:1f.1/host3
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1f.1/host3
+E: DEVTYPE=scsi_host
+
+P: /devices/pci0000:00/0000:00:1f.1/host3/scsi_host/host3
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1f.1/host3/scsi_host/host3
+
+P: /devices/pci0000:00/0000:00:1f.1/host3/target3:0:0
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1f.1/host3/target3:0:0
+E: DEVTYPE=scsi_target
+
+P: /devices/pci0000:00/0000:00:1f.1/host3/target3:0:0/3:0:0:0
+E: UDEV_LOG=3
+E: DEVPATH=/devices/pci0000:00/0000:00:1f.1/host3/target3:0:0/3:0:0:0
+E: DEVTYPE=scsi_device
+E: DRIVER=sr
+E: MODALIAS=scsi:t-0x05
+</info>
+
+ <!-- The content of publicly accessible files in /sys/class/dmi/id/
+ format: filename:content
+ as for example generated by "grep -r . /sys/class/dmi/id/"
+ -->
+
+ <info command="grep -r . /sys/class/dmi/id/ 2>/dev/null">/sys/class/dmi/id/bios_vendor:LENOVO
+/sys/class/dmi/id/bios_version:7LETB9WW (2.19 )
+/sys/class/dmi/id/bios_date:06/06/2008
+/sys/class/dmi/id/sys_vendor:LENOVO
+/sys/class/dmi/id/product_name:6457BAG
+/sys/class/dmi/id/product_version:ThinkPad T61
+/sys/class/dmi/id/board_vendor:LENOVO
+/sys/class/dmi/id/board_name:6457BAG
+/sys/class/dmi/id/board_version:Not Available
+/sys/class/dmi/id/chassis_vendor:LENOVO
+/sys/class/dmi/id/chassis_type:10
+/sys/class/dmi/id/chassis_version:Not Available
+/sys/class/dmi/id/chassis_asset_tag:No Asset Information
+/sys/class/dmi/id/modalias:dmi:bvnLENOVO:bvr7LETB9WW(2.19)
+</info>
+ </context>
+</system>
=== modified file 'lib/lp/hardwaredb/scripts/hwdbsubmissions.py'
--- lib/lp/hardwaredb/scripts/hwdbsubmissions.py 2011-08-31 16:38:48 +0000
+++ lib/lp/hardwaredb/scripts/hwdbsubmissions.py 2011-09-01 20:51:27 +0000
@@ -72,6 +72,13 @@
re.VERBOSE)
_broken_comment_nodes_re = re.compile('(<comment>.*?</comment>)', re.DOTALL)
+_missing_udev_node_data = re.compile(
+ '<info command="udevadm info --export-db">(.*?)</info>', re.DOTALL)
+_missing_dmi_node_data = re.compile(
+ r'<info command="grep -r \. /sys/class/dmi/id/ 2>/dev/null">(.*?)'
+ '</info>', re.DOTALL)
+_udev_node_exists = re.compile('<hardware>.*?<udev>.*?</hardware>', re.DOTALL)
+_dmi_node_exists = re.compile('<hardware>.*?<dmi>.*?</hardware>', re.DOTALL)
ROOT_UDI = '/org/freedesktop/Hal/devices/computer'
UDEV_ROOT_PATH = '/devices/LNXSYSTM:00'
@@ -166,7 +173,36 @@
# A considerable number of reports for Lucid has ESC characters
# in comment nodes. We don't need the comment nodes at all, so
# we can simply empty them.
- return _broken_comment_nodes_re.sub('<comment/>', submission)
+ submission = _broken_comment_nodes_re.sub('<comment/>', submission)
+
+ # Submissions from Natty don't have the nodes <dmi> and <udev>
+ # as children of the <hardware> node. Fortunately, they provide
+ # this data in
+ #
+ # <context>
+ # <info command="grep -r . /sys/class/dmi/id/ 2>/dev/null">
+ # ...
+ # </info>
+ # <info command="udevadm info --export-db">
+ # ...
+ # </info>
+ # </context>
+ #
+ # We can try to find the two relevant <info> nodes inside <context>
+ # and move their content into the proper subnodes of <hardware>.
+ if _udev_node_exists.search(submission) is None:
+ mo = _missing_udev_node_data.search(submission)
+ if mo is not None:
+ missing_data = mo.group(1)
+ missing_data = '<udev>%s</udev>\n</hardware>' % missing_data
+ submission = submission.replace('</hardware>', missing_data)
+ if _dmi_node_exists.search(submission) is None:
+ mo = _missing_dmi_node_data.search(submission)
+ if mo is not None:
+ missing_data = mo.group(1)
+ missing_data = '<dmi>%s</dmi>\n</hardware>' % missing_data
+ submission = submission.replace('</hardware>', missing_data)
+ return submission
def _getValidatedEtree(self, submission, submission_key):
"""Create an etree doc from the XML string submission and validate it.
=== modified file 'lib/lp/hardwaredb/tests/test_hwdb_submission_validation.py'
--- lib/lp/hardwaredb/tests/test_hwdb_submission_validation.py 2011-08-31 16:38:48 +0000
+++ lib/lp/hardwaredb/tests/test_hwdb_submission_validation.py 2011-09-01 20:51:27 +0000
@@ -2823,3 +2823,22 @@
submission_id, result,
'Extra element context in interleave',
'detection of an invalid sub.node of <info> failed')
+
+ def test_natty_reports_validate(self):
+ # HWDB submissions from Natty can be processed.
+ # the raw data from these reports would be passed directly
+ # to the RelaxNG validator, they would fail, because they
+ # do not have the sub-nodes <dmi> and <udev> inside <hardware>.
+ # The data is stored instead in the nodes
+ # <info command="grep -r . /sys/class/dmi/id/ 2>/dev/null">
+ # and <info command="udevadm info --export-db">
+ #
+ # The method SubmissionParser.fixFrequentErrors() (called by
+ # _getValidatedEtree()) creates the missing nodes, so that
+ # _getValidatedEtree() succeeds.
+ sample_data_path = os.path.join(
+ config.root, 'lib', 'canonical', 'launchpad', 'scripts',
+ 'tests', 'hardwaretest-natty.xml')
+ sample_data = open(sample_data_path).read()
+ result, submission_id = self.runValidator(sample_data)
+ self.assertIsNot(None, result)