desktop-packages team mailing list archive

[Bug 480130] Re: [upstream] Calc truncates data from HTML based .xls

Launchpad has imported 9 comments from the remote bug at
https://bugs.freedesktop.org/show_bug.cgi?id=35756.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2011-03-29T01:14:21+00:00 Marco wrote:

When importing a document (an HTML file named .XLS or .HTML) with many rows
(more than 15000), OpenOffice Calc truncates the data without showing any error
or warning.

The issue can be reproduced by importing the attached file into Calc. On my
machine it imports only 13635 of the 20000 rows.
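
The attachment itself is not reproduced in this archive. As a rough stand-in,
a comparable test file could be generated with a short script. This is only a
sketch: the function names, the five-column layout, and the cell text are
assumptions for illustration and do not describe the reporter's actual file.

```python
from pathlib import Path


def build_html_table(rows: int, cols: int = 5) -> str:
    """Return a minimal HTML document holding one table of rows x cols cells.

    The layout (5 columns, "R<row>C<col>" cell text) is an assumption,
    not a copy of the file attached to the bug report.
    """
    lines = ["<html><body><table>"]
    for r in range(1, rows + 1):
        cells = "".join(f"<td>R{r}C{c}</td>" for c in range(1, cols + 1))
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table></body></html>")
    return "\n".join(lines)


def write_test_file(path: str, rows: int = 20000, cols: int = 5) -> None:
    """Write the generated table to disk, e.g. as 'rows20000.xls'."""
    Path(path).write_text(build_html_table(rows, cols), encoding="utf-8")
```

Calling write_test_file("rows20000.xls") and opening the result in Calc should
show whether the last row displayed matches the last row written.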

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/6

------------------------------------------------------------------------
On 2011-03-29T01:18:17+00:00 Marco wrote:

Created attachment 44977
20000 rows as HTML table

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/7

------------------------------------------------------------------------
On 2011-03-29T05:41:03+00:00 Libreoffice-z wrote:

The effect is reproducible with reporter's sample document and
"LibreOffice 3.3.2  – WIN7  Home Premium  (64bit) German UI [OOO330m19
(Build:202 / tag 3.3.2.2)]"

I have seen a lot of documents with the file extension .xls that have
nothing to do with an EXCEL spreadsheet; the user or the application only
used that extension because the contents are "somehow a table".

To be honest, I do not know much about EXCEL HTML documents, except that
it is a mess to work with them. Imho that's an EXCEL problem; EXCEL
should create documents with correct syntax.

The reporter's sample is not correct html, although the source text pretends to be html. At the very least, the html type information is missing.
It's also not an EXCEL type spreadsheet.

MS EXCEL viewer will not open that document.

Some other observations:
OOo 3.1.1 (opened from an open WRITER document) will by default open the document as a WRITER-HTML document in Writer, with a correct table view up to "A12800"; after that the table view stops and the strings from the table are shown as one endless line of plain text.
I can force OOo to open the document as html-calc; it then opens the document as a spreadsheet, and "E13105" is the last content shown correctly before the table formatting breaks.

Exactly the same with OOo-dev 3.4

My result:
My aversion to such documents has nothing to do with the reported problem: LibO should either reject the document or open it correctly (maybe with a warning message). Low priority; important data should be exported to a document with correct syntax, and that is a problem of the application creating such documents.

@Marco:
Which application do you get such documents from?

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/8

------------------------------------------------------------------------
On 2011-03-29T05:54:31+00:00 Libreoffice-z wrote:

Although the "html" code is completely different, I see something similar to the reported problem with the attachment of OOo bug
 Bug 111579 -  Opening large html excel document from SAS  
<http://openoffice.org/bugzilla/show_bug.cgi?id=111579>
When that document is opened with LibO CALC (from WIN Explorer), the last correctly shown cell, 'F6712', has the contents "PXXX09.001.AAAA.BBBB 1728". The next cell is broken and no further contents are shown; the table ends with the date 15/09/2009.

Renaming the document to .html and opening it with SeaMonkey shows that
there is much more content after "15/09/2009".

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/9

------------------------------------------------------------------------
On 2011-03-29T06:27:28+00:00 Marco wrote:

(In reply to comment #3)
> Although the "html" code is completely different, I see something similar to
> the reported problem with the attachment of OOo bug
>  Bug 111579 -  Opening large html excel document from SAS  
> <http://openoffice.org/bugzilla/show_bug.cgi?id=111579>
> Opening that document with LibO CALC (from WIN Explorer) the last correctly
> shown cell 'F6712' will have contents "PXXX09.001.AAAA.BBBB 1728". Next cell
> will be broken, no further contents will be shown, Table ends with date
> 15/09/2009
> 
> Renaming document to .html and opening with SeaMonkey shows: there is much
> contents behind "15/09/2009"

Yes, I agree; it seems to be the same issue.

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/10

------------------------------------------------------------------------
On 2011-03-29T06:39:44+00:00 Marco wrote:

(In reply to comment #3)
> Although the "html" code is completely different, I see something similar to
> the reported problem with the attachment of OOo bug
>  Bug 111579 -  Opening large html excel document from SAS  
> <http://openoffice.org/bugzilla/show_bug.cgi?id=111579>
> Opening that document with LibO CALC (from WIN Explorer) the last correctly
> shown cell 'F6712' will have contents "PXXX09.001.AAAA.BBBB 1728". Next cell
> will be broken, no further contents will be shown, Table ends with date
> 15/09/2009
> 
> Renaming document to .html and opening with SeaMonkey shows: there is much
> contents behind "15/09/2009"

The .XLS extension is used for users' convenience, as that extension is
associated with LibreOffice or MS Excel by default.

MS Excel 2010 imports the example file without a problem. It only shows a
warning that the file is not an Excel file.

Such files are generated by applications which cannot create native .XLS (or
.XLSX). The example file is one I created manually to demonstrate the
issue.


However, the main issue I see here is that LibreOffice cannot import huge HTML
tables. It should either import all of the data or show a warning message.
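
To check independently how many rows a given file really contains (and so how
much an import dropped), the table rows can be counted outside the office
suite. A minimal sketch using only the Python standard library follows; the
class and function names are made up for illustration, and the parser only
counts <tr>, <td> and <th> start tags without validating the markup.

```python
from html.parser import HTMLParser
from typing import Tuple


class RowCounter(HTMLParser):
    """Count <tr> rows and <td>/<th> cells in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.rows = 0
        self.cells = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case, so <TR> and <tr> both match.
        if tag == "tr":
            self.rows += 1
        elif tag in ("td", "th"):
            self.cells += 1


def count_rows_and_cells(html_text: str) -> Tuple[int, int]:
    """Return (row count, cell count) for the given HTML text."""
    parser = RowCounter()
    parser.feed(html_text)
    parser.close()
    return parser.rows, parser.cells
```

Comparing the row count reported here with the last populated row in Calc
gives the number of rows lost on import.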

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/11

------------------------------------------------------------------------
On 2011-08-22T07:14:10+00:00 Ctibor-brancik wrote:

I can confirm this bug in LibreOffice 3.4.2 too. It happens for me on
somewhat smaller tables with around 3000 rows. Interestingly, the borders
of the table are rendered down to the last row, but the data is truncated
at a seemingly random point somewhere in the middle of each file.

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/12

------------------------------------------------------------------------
On 2011-12-23T19:45:48+00:00 Björn Michaelsen wrote:

[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation:
http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/15

------------------------------------------------------------------------
On 2011-12-24T07:04:19+00:00 Marco wrote:

The issue is still open and reproducible with "3.5.0 beta2".

Reply at: https://bugs.launchpad.net/df-libreoffice/+bug/480130/comments/16


** Bug watch added: openoffice.org/bugzilla/ #111579
   http://openoffice.org/bugzilla/show_bug.cgi?id=111579

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to libreoffice in Ubuntu.
https://bugs.launchpad.net/bugs/480130

Title:
  [upstream] Calc truncates data from HTML based .xls

Status in LibreOffice Productivity Suite:
  Confirmed
Status in The OpenOffice.org Suite:
  Confirmed
Status in “libreoffice” package in Ubuntu:
  Triaged
Status in “openoffice.org” package in Ubuntu:
  Won't Fix

Bug description:
  Binary package hint: openoffice.org

  1) lsb_release -rd
  Description:	Ubuntu 11.04
  Release:	11.04

  2) apt-cache policy libreoffice-calc
  libreoffice-calc:
    Installed: 1:3.3.2-1ubuntu5
    Candidate: 1:3.3.2-1ubuntu5
    Version table:
   *** 1:3.3.2-1ubuntu5 0
          500 http://us.archive.ubuntu.com/ubuntu/ natty-updates/main i386 Packages
          100 /var/lib/dpkg/status
       1:3.3.2-1ubuntu4 0
          500 http://us.archive.ubuntu.com/ubuntu/ natty/main i386 Packages

  3) What is expected to happen in LibreOffice Calc via the Terminal:

  cd ~/Desktop && wget https://bugs.launchpad.net/ubuntu/+source/openoffice.org/+bug/480130/+attachment/1019499/+files/OE_Enrollment_Audit_20091110114007-C55555.2.xls && localc -nologo OE_Enrollment_Audit_20091110114007-C55555.2.xls

  is the file displays all 12384 rows.

  4) What happens instead is it only displays data for the first 6700
  rows. It shows border formatting for rows 6701 to 12384, but no data.

  WORKAROUND: Use Gnumeric.

  apt-cache policy gnumeric
  gnumeric:
    Installed: 1.10.13-1ubuntu1
    Candidate: 1.10.13-1ubuntu1
    Version table:
   *** 1.10.13-1ubuntu1 0
          500 http://us.archive.ubuntu.com/ubuntu/ natty/universe i386 Packages
          100 /var/lib/dpkg/status

  WORKAROUND: Excel 2003 via WINE.

  Microsoft Office Excel 2003 (11.5612.6505)

  apt-cache policy wine1.3
  wine1.3:
    Installed: 1.3.19-0ubuntu1~maverick1~ppa1
    Candidate: 1.3.19-0ubuntu1~maverick1~ppa1
    Version table:
   *** 1.3.19-0ubuntu1~maverick1~ppa1 0
          100 /var/lib/dpkg/status
       1.3.15-0ubuntu5 0
          500 http://us.archive.ubuntu.com/ubuntu/ natty/universe i386 Packages

  Original Report Comments: I have a Web App that generates HTML based
  spreadsheets with very simple tables. The amount of data displayed
  varies with the length of the cell contents.
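
  One possible reading of this observation, not confirmed anywhere in this
  report, is that the cut-off depends on how much data precedes a row rather
  than on the row number. To test that, the byte offset at which a given row
  starts can be compared across files. The sketch below is illustrative only:
  the function name is hypothetical and the regular expression simply looks
  for "<tr" start tags without parsing the HTML.

```python
import re

# Matches the start of a <tr> tag (any case), with or without attributes.
ROW_START = re.compile(rb"<tr[\s>]", re.IGNORECASE)


def byte_offset_of_row(data: bytes, row_number: int) -> int:
    """Return the byte offset where the row_number-th <tr> (1-based) starts.

    Returns -1 if the file has fewer rows than row_number.
    """
    for index, match in enumerate(ROW_START.finditer(data), start=1):
        if index == row_number:
            return match.start()
    return -1
```

  If the offsets of the first missing row are similar across files with
  different row counts, that would point to a size-related limit; if they
  differ widely, the cause lies elsewhere.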

  ProblemType: Bug
  Architecture: i386
  Date: Tue Nov 10 11:42:07 2009
  DistroRelease: Ubuntu 9.10
  InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release i386 (20091028.5)
  Package: openoffice.org-core 1:3.1.1-5ubuntu1
  ProcEnviron:
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcVersionSignature: Ubuntu 2.6.31-14.48-generic
  SourcePackage: openoffice.org
  Uname: Linux 2.6.31-14-generic i686

To manage notifications about this bug go to:
https://bugs.launchpad.net/df-libreoffice/+bug/480130/+subscriptions