Parsing .xlsx files

Posted by Doh004, 02-08-2010, 11:18 PM
Hey guys, I'm doing some work and I need to support parsing data out of Excel 2007 files. I've been looking around a bit, but the one PHP Excel library I found only works for .xls files. I can program in PHP and Python, so I could probably combine the two to accomplish this. Does anyone have experience with .xlsx files?

Posted by luki, 02-09-2010, 02:18 AM
Google finds this: http://phpexcel.codeplex.com/wikipag...Title=Examples
It sounds like it should work for you; it reads and writes Excel 2007 files.

Posted by e-Sensibility, 02-09-2010, 02:57 AM
I know you said PHP or Python, but I've had a decent experience with this Perl module in the recent past: http://search.cpan.org/~dmow/Spreads...dsheet/XLSX.pm
Most Perl syntax will carry over from PHP, so using this module may be a viable alternative for you if you can't find something better in one of your preferred languages.

Posted by mattle, 02-09-2010, 09:36 AM
Caveat: you must have the php_zip extension installed to read or write the Office 2007 formats (.xlsx and friends). Otherwise, PHPExcel is a helluva tool. I use it all the time to generate reports for the accountants I write software for.

Posted by Doh004, 02-09-2010, 12:31 PM
I had looked into PHPExcel, but I felt that there was so much there that I don't need. I'm not looking to write or create any Excel files. Looking back on it, I feel it might just be my best option. Also, I guess I could try out Perl. For some reason the language has always scared me.

Posted by Host Ahead, 02-09-2010, 12:34 PM
Hi, .xlsx files are actually just zip files, so you need a zip component to unzip the file. When you unzip it you will find XML files that describe your content. Try opening an .xlsx file with WinZip or WinRAR and you'll see the files, so you get to know the structure.
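To look at that structure without WinZip, here is a minimal Python sketch using the standard zipfile module. The helper names list_xlsx_parts and make_demo_xlsx are made up for illustration, and the demo archive is only a stand-in with placeholder XML, not a workbook Excel would open.

```python
import io
import zipfile


def list_xlsx_parts(source):
    """Return the names of the files stored inside an .xlsx archive.

    `source` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return sorted(archive.namelist())


def make_demo_xlsx():
    """Build a tiny stand-in archive in memory (hypothetical, for the demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("xl/workbook.xml", "<workbook/>")
        archive.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real file you would pass its path instead of the demo buffer.
    for name in list_xlsx_parts(make_demo_xlsx()):
        print(name)
```

On a real workbook the listing is longer, but the worksheet data normally sits under xl/worksheets/.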

Posted by mattle, 02-09-2010, 12:55 PM
I'd be happy to help you out with your Perl questions. Love Perl. That said, I disagree with your rationale for steering away from PHPExcel. What I like about it is that you get an Excel-like API, i.e. most of the functions correspond to something you're used to seeing in Excel. It's not much different from spreadsheets in general: just because I'm not going to use VLOOKUPs and cross-sheet references doesn't mean that Excel isn't still a good tool for doing my to-do list...

Posted by Doh004, 02-10-2010, 08:38 PM
So I took your advice, guys, and started using PHPExcel, which is quite a program. I'm able to import the spreadsheets just fine, but I honestly have no clue how to start parsing through the information. Once I have the PHPExcel object, is there a way to convert it to XML or something else I could traverse easily? I've tried exporting it to an HTML table and then converting that to something traversable, but a bunch of stuff went wrong with that. Thanks.

Posted by mattle, 02-11-2010, 12:31 AM
I guess that depends on how you plan to traverse it... there is a Worksheet Iterator: http://www.lindasbusinesscenter.com/...tIterator.html
If you want to parse XML, you might as well just unzip the file and parse the XML! That's the native format anyway. Maybe if you can describe the format of the worksheets you're working with and how you want to capture the data, I can be more helpful.
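For the unzip-and-parse route, a rough Python sketch with only the standard library could look like this. It assumes the first sheet is stored as xl/worksheets/sheet1.xml (the usual name, though real files can differ) and it only handles shared strings and plain values, not inline strings, dates, or formulas. The function names and the demo workbook are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the SpreadsheetML parts inside an .xlsx archive.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_shared_strings(archive):
    """Return the shared string table, or [] if the part is absent."""
    if "xl/sharedStrings.xml" not in archive.namelist():
        return []
    root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    # An <si> entry may hold several <t> runs; join them into one string.
    return ["".join(t.text or "" for t in si.iterfind(".//m:t", NS))
            for si in root.iterfind("m:si", NS)]


def read_rows(source, sheet_part="xl/worksheets/sheet1.xml"):
    """Return one {cell reference: text} dict per row of the sheet."""
    with zipfile.ZipFile(source) as archive:
        strings = read_shared_strings(archive)
        root = ET.fromstring(archive.read(sheet_part))
    rows = []
    for row in root.iterfind("m:sheetData/m:row", NS):
        cells = {}
        for cell in row.iterfind("m:c", NS):
            value = cell.findtext("m:v", default="", namespaces=NS)
            if cell.get("t") == "s":  # value is an index into the string table
                value = strings[int(value)]
            cells[cell.get("r")] = value
        rows.append(cells)
    return rows


def make_demo_xlsx():
    """Build a minimal in-memory archive with just the parts read above."""
    shared = ('<sst xmlns="%s"><si><t>Item</t></si><si><t>Qty</t></si>'
              '<si><t>Widget</t></si></sst>' % NS["m"])
    sheet = ('<worksheet xmlns="%s"><sheetData>'
             '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>'
             '<row r="2"><c r="A2" t="s"><v>2</v></c><c r="B2"><v>3</v></c></row>'
             '</sheetData></worksheet>' % NS["m"])
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/sharedStrings.xml", shared)
        archive.writestr("xl/worksheets/sheet1.xml", sheet)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for row in read_rows(make_demo_xlsx()):
        print(row)
```

Text cells store an index into the shared string table rather than the text itself, which is why the lookup step is needed before the values are usable.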

Posted by Doh004, 02-11-2010, 10:17 AM
That's the thing though. We need to parse any sort of Excel file, as legacy clients like to send order forms in Excel documents! Some have various headers and text above where the actual order information is. If I could convert it into something traversable, like that iterator thing that I will look at later, I would be able to do my own regexing/parsing.
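One way to approach that, sketched in Python under some loud assumptions: the sheet has already been turned into a list of rows (lists of cell values), the order table has a header row containing column names such as "SKU" and "Qty", and a valid order line has a whole-number quantity. The column names and function names here are hypothetical; real client forms will need their own rules.

```python
import re


def find_header_row(rows, required):
    """Return the index of the first row containing every required column name."""
    wanted = {name.lower() for name in required}
    for index, row in enumerate(rows):
        present = {str(cell).strip().lower() for cell in row if cell is not None}
        if wanted <= present:
            return index
    return None


def extract_orders(rows, sku_col="sku", qty_col="qty"):
    """Return order lines found below the header row.

    Rows whose quantity is not a whole number (blank lines, notes,
    footers) are skipped.
    """
    header_index = find_header_row(rows, (sku_col, qty_col))
    if header_index is None:
        return []
    header = ["" if c is None else str(c).strip().lower()
              for c in rows[header_index]]
    orders = []
    for row in rows[header_index + 1:]:
        record = dict(zip(header, row))
        raw_qty = record.get(qty_col)
        qty_text = "" if raw_qty is None else str(raw_qty).strip()
        if not re.fullmatch(r"\d+", qty_text):
            continue
        orders.append({sku_col: record.get(sku_col), qty_col: int(qty_text)})
    return orders


if __name__ == "__main__":
    demo_rows = [
        ["ACME Corp order form", None, None],
        [None, None, None],
        ["SKU", "Description", "Qty"],
        ["A-100", "Widget", "3"],
        ["", "", ""],
        ["B-200", "Gadget", "12"],
        ["Thank you!", None, None],
    ]
    for order in extract_orders(demo_rows):
        print(order)
```

Locating the header by its column names, rather than by a fixed row number, is what lets the same code cope with a varying amount of text above the table.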

Posted by Kevinwills, 02-11-2010, 11:25 AM
I need some info about these .xlsx files, because one ad company provides its signup form in this format. To get to the point: can we get HTML from the .xlsx format? I mean, can we convert an .xlsx file into HTML?

Posted by Doh004, 02-11-2010, 12:23 PM
Yes, using PHPExcel. That's the easy part.

Posted by Doh004, 02-16-2010, 01:26 AM
Sorry to bump this thread, but does anyone know how to start stepping through the PHPExcel object?

Posted by mattle, 02-16-2010, 10:09 AM
I think getCellCollection() is the recommended method for iterating within a worksheet. The entire object is a collection of worksheets as well as global document settings...I'm guessing you actually just want to step through the cells on one sheet? ref: http://phpexcel.codeplex.com/Thread/...ThreadId=32487


