Topic: Any scripters / Excel specialists here?

Sven (Uber Member, Posts: 1022)
« on: December 06, 2012, 11:29:25 PM »
Hi all!

Maybe this is a bit off-topic...
Are there any scripters (Terminal or Automator on Mac) or Excel specialists here?

The FIS website has start lists online for nearly every ski event.
For most of them, creating code replacements is really easy (all the details are in a single row), except for ski jumping.
Those lists are only provided as a PDF table with 5 lines per athlete.
An example (copied out of the PDF) is listed below:

1
KEIL Katharina
NST Salzkammergut
AUT
19 APR 1993

I am looking for an easy way to turn this multi-line data into a single line per athlete that contains only lines 1, 2 and 4 (bib, last and first name, country).

Is anyone able to help?

Thanks
Sven
Changed from behind the cam to one who buys images as I started to run. No cam or lens left.

Hayo Baan (Uber Member, Posts: 2466, Professional Photographer & Software Developer)
« Reply #1 on: December 07, 2012, 12:03:02 AM »
Hi Sven,

Sure, this should be rather easy to do. If you send me sample input and tell me how you would like it formatted, I can probably write you a small script for this. I may even be able to do it directly from the PDF, so please also send me a sample PDF. My email is in my signature.
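
For text that has already been copied out of the PDF in the five-line layout shown above, something like the following might already be enough. It is only a minimal sketch, assuming every athlete occupies exactly five lines (bib, name, club, country, birth date) and that blank lines can be ignored. The function name collapse_startlist is made up for this illustration, and the name line is kept as it is rather than split into last and first name.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): collapse pasted start-list text,
# five lines per athlete, into "bib,name,country" lines.
# Assumes the line order bib / name / club / country / birth date.
collapse_startlist() {
    awk '
        NF == 0 { next }                 # skip blank lines
        { field[++n] = $0 }              # collect the lines of one athlete
        n == 5 {                         # full record: keep lines 1, 2 and 4
            print field[1] "," field[2] "," field[4]
            n = 0
        }
    ' "$@"
}

# Example with the athlete from the first post
printf '%s\n' 1 'KEIL Katharina' 'NST Salzkammergut' AUT '19 APR 1993' | collapse_startlist
# prints: 1,KEIL Katharina,AUT
```

The function also accepts file names, e.g. collapse_startlist pasted.txt, where pasted.txt stands for whatever file the copied text was saved in.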

Cheers,
Hayo
Hayo Baan - Photography
Web: www.hayobaan.nl

Hayo Baan
« Reply #2 on: December 07, 2012, 02:13:55 AM »
Hi Sven, I think I really earned my title as Ubermember here now  8)

It was quite a puzzle, but I think I've got it worked out. If you paste the following code into a script file, e.g. fis, and make it executable (chmod +x fis), you can then extract the comma-separated data with e.g. ./fis 2013JP3891SLT.pdf

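Code:
#!/bin/sh
# Usage: fis startlist.pdf
# Prints one "bib,Last name,First name,country" line per athlete.

ps2ascii "$1" | perl -e '
while (<>) {
    # Fields: bib, LAST NAME, First name, team, country code, birth date
    if (($nr,$last,$first,$team,$country,$birthdate) = /(\d\d?) ([\p{upper} ]+) ((?:(?<= )\p{upper}\p{lower}+ ?)+)(\p{upper}(?:.+ )+)(\p{upper}{3})(\d\d? \p{upper}{3} \d{4})/) {
        # Capitalize each word of the upper-case last name
        printf("%d,%s,%s,%s\n",$nr,join(" ",map { ucfirst lc } split(/ /,$last)),$first,$country)
    }
}'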

Names should be properly capitalized. The team name and birth date are already captured in $team and $birthdate, so if you want them too you can add them to the printf.

Note: the script uses the ps2ascii command, which is part of Ghostscript. If you don't have it installed on your system, you can install it using e.g. MacPorts (or look for a binary Mac release of Ghostscript somewhere).
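
If you are not sure whether ps2ascii is installed, a small guard like the one below could be put in front of the pipeline. This is only a sketch: the function name require_tool is made up, and the MacPorts command in the message assumes the port is simply called ghostscript.

```shell
#!/bin/sh
# Hypothetical guard (not part of the script above): check that a command
# is on the PATH before the pipeline relies on it.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "error: $1 not found on PATH" >&2
        return 1
    }
}

# Example: warn early instead of failing in the middle of the pipeline
require_tool ps2ascii ||
    echo "ps2ascii is missing: install Ghostscript, e.g. sudo port install ghostscript" >&2
```

In the real script you would probably exit after the warning so that perl never runs on empty input.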

Hope this helps  :)

Here's the output of the script when run on the file you provided:
1,Keil,Katharina,AUT
2,Althaus,Katharina,GER
3,Haralambie,Dana Vasilica,ROU
4,Veshchikova,Anastasia,RUS
5,Kasai,Yoshiko,JPN
6,Lemare,Lea,FRA
7,Liu,Qi,CHN
8,Faisst,Melanie,GER
9,D Agostina,Roberta,ITA
10,Graessler,Ulrike,GER
11,Schoitsch,Sonja,AUT
12,Clair,Julia,FRA
13,Pretorius,Alexandra,CAN
14,Hirayama,Yurika,JPN
15,Gladysheva,Anastasiya,RUS
16,Yamada,Yurina,JPN
17,Watase,Ayumi,JPN
18,Avvakumova,Irina,RUS
19,Henrich,Taylor,CAN
20,Windmueller,Bigna,SUI
21,Tanaka,Atsuko,CAN
22,Tepes,Anja,SLO
23,Hughes,Abby,USA
24,Seyfarth,Juliane,GER
25,Kykkaenen,Julia,FIN
26,Rogelj,Spela,SLO
27,Logar,Eva,SLO
28,Vuik,Wendy,NED
29,Bogataj,Ursa,SLO
30,Ito,Yuki,JPN
31,Jahr,Line,NOR
32,Vtic,Maja,SLO
33,Runggaldier,Elena,ITA
34,Jerome,Jessica,USA
35,Wuerth,Svenja,GER
36,Lundby,Maren,NOR
37,Pozun,Katja,SLO
38,Seifriedsberger,Jacqueline,AUT
39,Van,Lindsey,USA
40,Mattel,Coline,FRA
41,Iraschko,Daniela,AUT
42,Insam,Evelyn,ITA
43,Vogt,Carina,GER
44,Sagen,Anette,NOR
45,Hendrickson,Sarah,USA
46,Takanashi,Sara,JPN

Sven
« Reply #3 on: December 07, 2012, 04:34:49 AM »
Hayo,

many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many, many THANKS!

Works like a charm!

A binary version of Ghostscript for Mac that works under 10.8.x can be found here: http://pages.uoregon.edu/koch/

Sven

EDIT: Found a little error when athletes have a double name with a '-'...
« Last Edit: December 07, 2012, 04:48:55 AM by SK-Foto »

Hayo Baan
« Reply #4 on: December 07, 2012, 05:20:56 AM »
You're welcome! This new piece of code should fix the problem with a dash in the names  :)

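Code:
#!/bin/sh
# Usage: fis startlist.pdf
# Same as before, but dashes are now accepted in last and first names.

ps2ascii "$1" | perl -e '
while (<>) {
    # Fields: bib, LAST NAME, First name, team, country code, birth date
    if (($nr,$last,$first,$team,$country,$birthdate) = /(\d\d?) ([\p{upper} -]+) ((?:(?<=[ -])\p{upper}[\p{lower}-]+ ?)+)((?<!-)\p{upper}(?:.+ )+)(\p{upper}{3})(\d\d? \p{upper}{3} \d{4})/) {
        # Lower-case everything after the first letter of each name part
        $last=~s/(?<=\p{upper})([^ -]*)/\L$1\E/g;
        printf("%d,%s,%s,%s\n",$nr,$last,$first,$country)
    }
}'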

Sven
« Reply #5 on: December 07, 2012, 05:38:03 AM »
Hayo, yes it does!

Many thanks again...

Sven