SSH, Unix commands & RegEx

This summer I am sitting in on a computational linguistics course. It is the first instruction I have had in UNIX. Pretty awesome.
It has required me to do some googling for terminal commands.

This is a rough sketch of where I have been.

UNIX:
http://www.osxfaq.com/Tutorials/LearningCenter/

SSH:
http://kimmo.suominen.com/docs/ssh/
http://ss64.com/osx/

TERMINAL:
http://homepage.mac.com/rgriff/files/TerminalBasics.pdf

grep:
http://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/
http://en.wikipedia.org/wiki/Grep
http://www.computerhope.com/unix/ugrep.htm

Regular Expressions:
http://www.zytrax.com/tech/web/regex.htm
http://www.regular-expressions.info/tutorial.html
http://gnosis.cx/publish/programming/regular_expressions.html

RegEx and Unicode:
One of the issues that I have had with RegEx is what counts as a natural class, i.e. [A-Z], [A-Za-z], [0-9], etc. As a linguist I deal a lot with IPA characters, subscripts, superscripts, Unicode, and diacritics. How am I to define a natural class with these? Can I define a natural class based on the phonology of the language?

So I did some more searching:
http://unicode.org/reports/tr18/
http://unicode.org/reports/tr18/tr18-5.1.html
http://icu-project.org/docs/papers/iuc26_regexp.pdf
http://courses.ischool.berkeley.edu/i256/f06/papers/regexps_tutorial.pdf
http://wapedia.mobi/en/Regular_expression?t=5.
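
One possible way to get at the natural-class question in Python is to spell each class out as an explicit list of symbols and let any following diacritics or modifier letters ride along with the base symbol. This is only a sketch. The class names and their members are made up for illustration (they do not describe any particular language), and the two Unicode ranges cover only the Combining Diacritical Marks block (U+0300 to U+036F) and the Spacing Modifier Letters block (U+02B0 to U+02FF), which is an assumption about what the data contains.

```python
import re
import unicodedata

# Hypothetical natural classes; symbols and groupings are illustrative only.
NATURAL_CLASSES = {
    "voiceless_stop": ["p", "t", "k", "ʔ"],
    "nasal": ["m", "n", "ŋ", "ɲ"],
    "high_vowel": ["i", "u", "ɨ"],
}

# Zero or more combining diacritics (U+0300-U+036F) or spacing modifier
# letters such as aspiration and length marks (U+02B0-U+02FF).
MODIFIERS = "[\u0300-\u036f\u02b0-\u02ff]*"


def class_pattern(class_name):
    """Compile a regex matching one segment of the named class plus its modifiers."""
    symbols = sorted(NATURAL_CLASSES[class_name], key=len, reverse=True)
    alternatives = "|".join(re.escape(s) for s in symbols)
    return re.compile("(?:" + alternatives + ")" + MODIFIERS)


def find_class(text, class_name):
    """Return every segment of the named class found in text."""
    # NFD splits precomposed characters into base + combining mark,
    # so the base symbol can be matched against the class list.
    decomposed = unicodedata.normalize("NFD", text)
    return class_pattern(class_name).findall(decomposed)


if __name__ == "__main__":
    print(find_class("t\u02b0apa", "voiceless_stop"))
    print(find_class("m\u0129\u014bu", "high_vowel"))
```

Listing the symbols by hand means the class follows the phonology of the language rather than the code-point order that a range like [A-Z] relies on.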

RegEx+PERL+Unicode:
http://perldoc.perl.org/perlretut.html

PERL:
http://www.enginsite.com/Library-Perl-Regular-Expressions-Tutorial.htm
http://www.cgi101.com/book/connect/mac.html
http://www.mactech.com/articles/mactech/Vol.18/18.09/PerlforMacOSX/index.html

Python:
http://www.amk.ca/python/howto/regex/

Social Network Marketing

My friend Abbie (Facebook, MySpace) is currently in a competition to perform live with Ingrid Michaelson. She is also currently in first place. (Go ahead and vote for her: http://www.ingridmichaelson.com/videocontest/vote/ Hers is #6 in the top ten listing.)

That is not the most interesting part, though.

What is interesting is the social networking going on to get all the votes needed.

Someone created an Open Event on Facebook. Abbie has about 1,700 Facebook friends and a fan page. But because the Facebook event is open, other people could invite their friends to it. So now there are over 11,500 people who have been invited to the event! That is nearly seven times the number of people that Abbie knows. And this has only been running for three or four days.

When people respond to the event, there is an option to add a personal message. The event page follows up with clear instructions (and links) describing how to vote. The event has gone viral. That is the point of Social Network Marketing.

I wonder whether an event created for my business purposes would fly. I only have 500 friends, so matching that roughly sevenfold reach would take only about 3,400 invites.

You can follow Abbie’s YouTube channel.

Selected Works™ & BePress

Bepress is an internet-based service from Berkeley Electronic Press. It basically allows users to display their work. For example, the publications listed on a professor’s CV can be displayed by Selected Works™ & BePress as downloadable PDFs. These works can also be described, and their bibliographic references can be downloaded. Berkeley Electronic Press archives the documents and presents them in an understandable, accessible, usable format. They have integrated Google search. Seems like lots of love all around.
This kind of thing would be really good for an organization like the one I work with.


This is a picture of a page. See this page live.


BePress has a lot of features, and it integrates with a lot of other services too. One service that looked really cool is their support for the editorial process used for working papers.


Simple Linguistics software

I often see good (maybe not sexy) software, like iBable, designed on the Mac for scientific purposes. I often wonder, “Why hasn’t anyone done something for or with linguistics?” Don’t get me wrong: linguistics is a big field. It is also a field with few standards for data interoperability, and even fewer standards for data description and markup. Just seeing something like iBable inspires me to want to learn Ruby and do something for linguistic data.

The Apple developer program is only $99 a year.
Tutorial on Ruby by Phusion.

SSH and Terminal

I used an ssh connection from the Terminal today for the first time!

Picture of Apple Terminal

I feel like a real man now.
I needed to transfer a 106MB folder from one subdomain to another subdomain on my DreamHost webserver. It has been my experience that whenever I copy or move folders with a lot of sub-folders, some things do not get copied, or do not get copied completely. So I needed to archive my files and move them as a single object. But I do not think it is possible to zip files with an FTP client (at least not with Interarchy). For a solution I turned to ssh and a lot of googling.

To ssh into my webhost, I first had to enable ssh for a user from the DreamHost panel.

Picture: panel to enable ssh for a user on DreamHost.

Picture: User Account Type page at DreamHost. (Second image from another tutorial.)

Then I had to open Terminal and create a key. I found some sensible directions in the knowledge base.

    To generate a secure public/private key pair to log in securely, and without a password (if you want):

  • In Terminal type: ssh-keygen -d
  • Hit the “enter” key three times.

    Replacing “username” and “yourdomain” with your FTP username and your-domain,

  • copy & paste/type the following into Terminal:

    ssh username@ftp.yourdomain.com 'test -d .ssh || mkdir -m 0700 .ssh ; cat >> .ssh/authorized_keys && chmod 0600 .ssh/*' < ~/.ssh/id_dsa.pub

  • Press return/enter key again.
    Wait for it to ask for the Password:

    Enter the password of the FTP user whose username you inserted in place of the example USERNAME@ftp.yourdomain.com above.
    If it asks you for the password multiple times, type in the same correct password each time.

    Then you will be at the root in your Terminal window.

  • type: ssh username@ftp.yourdomain.com
  • You're logged in!
    Now any time you want to log in using SSH you can just repeat
    ssh username@ftp.yourdomain.com
    from the command line (Terminal), no need to repeat the other steps.

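So from here on I was in my webhost but still didn't know how to get around. Evidently I needed to use full paths, so $ cd /home/username/directory would move me from directory to directory. I could not just $ cd /directory, because a path that starts with a slash is resolved from the root of the filesystem rather than from where I was.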
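
The difference between the two cd commands can be mimicked in a few lines of Python. This is a rough sketch, not what the shell actually runs: resolve_cd is a made-up helper name, it ignores symlinks and ~ expansion, and the paths are the same placeholders used above.

```python
import posixpath


def resolve_cd(current_dir, target):
    """Return the directory that `cd target` would land in, starting from current_dir."""
    # posixpath.join discards current_dir when target starts with "/",
    # which mirrors how the shell treats an absolute path.
    return posixpath.normpath(posixpath.join(current_dir, target))


if __name__ == "__main__":
    print(resolve_cd("/home/username", "directory"))   # relative: stays under home
    print(resolve_cd("/home/username", "/directory"))  # absolute: starts at the root
```

So a relative path without the leading slash is the short form I was looking for, and the long path is only needed when jumping somewhere from the root.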

Once I was able to get to the directory I needed to archive, I still needed the archive commands.

I thought I wanted to use zip as my archive utility. The zip command to do that would be:
$ zip -r folder.zip folder
My friend Daniel said that I probably should have used tar with gzip (producing a .tar.gz file) instead of the zip command: "Zip compresses each file separately and then archives. Tar+gzip or tar+bzip2 archives first and then compresses."

The commands to use the tools Daniel suggested would be like the following:

tar+gzip
$ tar -cf blah.tar folder/
$ gzip -9 blah.tar

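gzip-compressed tar in one step. I didn't try it, but the z flag tells tar to run the archive through gzip, so this combines the two commands above (at gzip's default compression level rather than -9).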
$ tar czvf folder.tgz folder

bzip2
$ tar jcvf filename.tbz folder
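
Daniel's point about zip versus tar+gzip can be tried out without touching the server. The sketch below uses Python's standard zipfile and tarfile modules as stand-ins for the command-line tools; make_archives is a made-up helper name, the test folder and its contents are invented, and the resulting sizes will depend on the files, so no numbers are claimed here.

```python
import os
import tarfile
import tempfile
import zipfile


def make_archives(folder, out_dir):
    """Archive folder as both .zip and .tar.gz in out_dir; return the two paths."""
    folder = os.path.abspath(folder)
    parent, base = os.path.split(folder)
    zip_path = os.path.join(out_dir, base + ".zip")
    tgz_path = os.path.join(out_dir, base + ".tar.gz")

    # Roughly `zip -r folder.zip folder`: every file is compressed separately.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full = os.path.join(root, name)
                zf.write(full, os.path.relpath(full, parent))

    # Roughly `tar czf folder.tar.gz folder`: one archive, one gzip stream.
    with tarfile.open(tgz_path, "w:gz") as tf:
        tf.add(folder, arcname=base)

    return zip_path, tgz_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        folder = os.path.join(work, "folder")
        os.makedirs(os.path.join(folder, "sub"))
        for i in range(50):
            with open(os.path.join(folder, "sub", f"note{i}.txt"), "w") as fh:
                fh.write("the same short text in every file\n" * 5)
        for path in make_archives(folder, work):
            print(os.path.basename(path), os.path.getsize(path), "bytes")
```

Either way the result is a single file, which is all I needed for moving the folder in one piece.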

After the file was compressed I used Interarchy to move the single zip file to its new location. I also needed to unzip the file. (I also read this.)
To unzip the file I navigated to the directory where the file was located and then used this command (unzip extracts the whole archive when no member names follow the archive name):
$ unzip folder.zip
I had to use the long path too. So it was really:
$ unzip /home/username/directory/folder.zip

What a sense of accomplishment!

Merging iLife Libraries

The Problem:
One user in a small business / family network can’t use (with metadata) all the media in a colleague’s or family member’s iTunes or iPhoto Library.

In our family there are three Macs (2 everyday machines and a server). On many work and personal tasks we function as a small workgroup. Unfortunately iTunes and iPhoto do not facilitate the sharing of media libraries (or for that matter the merging of media libraries). For instance, my wife had her own music and photo collection before we got married. Now if I want to browse that collection from my machine, there is iPhoto & iTunes sharing. But I cannot add tags or other metadata to photos on her Mac. I cannot create smart folders which we both can use.

iTunes
For our music we moved my collection to the server and made it like a “media center”. When we get new music we add it to the server. If we want a copy on our own machines we pull it as needed, e.g. for an iMovie project. This solution has not allowed my wife to add her collection to the server, nor has it solved the many duplicates which exist because we like many of the same songs. Now I have found a solution to this: PowerTunes.

iPhoto
Now the same problems exist for our photos. However, there is no real advantage to (or software for) hosting the family photos on our server. But we still need to define a photo capture strategy.

  • When we take new photos, to which computer are we going to download the photos?
  • Where will we have the master library?

I don’t have a complete solution to our photo capture, retention and access needs, but iPhoto Library Manager is the only software I have found that will let us maintain the metadata and merge our iPhoto Libraries. Still, this is a fantastic first-step strategy:

  • Consolidate the iPhoto Libraries.
  • Designate a computer to be the Master Library holder.
  • Share that iPhoto library across the network.
  • Back that computer up.

ProfDev Data Tracking

My wife has been tasked to be the Professional Development Coordinator for the company at which we work. Her task has several interesting aspects in the area of data tracking. One question needing to be asked is: “What are the experiences and skills of our current employees?” This suggests that a database with cross sections of professionally related events, people and skills is needed. These data then need to be viewable by various stakeholders so that they can be read, analyzed and understood, and eventually acted upon and incorporated into company strategies for doing business.

One thing that is obvious from the start is that a web-based collection system is needed for the data. A storage solution is also called for. And finally, a web-based analysis tool is needed for presenting the data in a variety of ways for final use.
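
To make the storage piece a little more concrete, here is a minimal sketch of what a database cross-referencing events, people and skills might look like, using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table and column names are invented, an in-memory database stands in for real storage, and people_with_skill is just one example of the kind of question a stakeholder might ask.

```python
import sqlite3

# Hypothetical schema: people attend events, and events are tagged with skills.
SCHEMA = """
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE skill  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE event  (id INTEGER PRIMARY KEY, title TEXT NOT NULL, held_on TEXT);
CREATE TABLE attendance (
    person_id INTEGER NOT NULL REFERENCES person(id),
    event_id  INTEGER NOT NULL REFERENCES event(id),
    PRIMARY KEY (person_id, event_id)
);
CREATE TABLE event_skill (
    event_id INTEGER NOT NULL REFERENCES event(id),
    skill_id INTEGER NOT NULL REFERENCES skill(id),
    PRIMARY KEY (event_id, skill_id)
);
"""


def open_database():
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def people_with_skill(conn, skill_name):
    """Names of people who attended at least one event tagged with the skill."""
    rows = conn.execute(
        """
        SELECT DISTINCT person.name
        FROM person
        JOIN attendance ON attendance.person_id = person.id
        JOIN event_skill ON event_skill.event_id = attendance.event_id
        JOIN skill ON skill.id = event_skill.skill_id
        WHERE skill.name = ?
        ORDER BY person.name
        """,
        (skill_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

The two linking tables are what give the "cross sections": the same rows can be read by person, by event, or by skill.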

So in an effort to help my wife out I have been looking at open source implementations of resume databases and CV-building databases. It has been my experience with IT solutions that people need unique implementations and have unique criteria to meet, but do not have unique problems. I think I even found a service that provides some professional development tracking, called Onefile. But for our company it makes sense to approach this problem with an eye to integrating it with other corporate IT infrastructure, rather than siloing it as an outsourced system.

Summaries of Goals

This effort to take a strategic look at the professional development of employees is part of an effort to look holistically at the corporation’s pool of human talent. The motivation is to be able to strategically deploy our skills where there is the largest return on investment. It is also important for us to be able to present our talented people and the products of their efforts to the world, both for credibility and for marketing.

Difficulties in the business world

There are quite a few legal challenges for companies (working in the U.S., Europe, and elsewhere) retaining these kinds of records, let alone sharing them with business partners.

Social networks are notorious for being able (if they are successful networks) to pull information from users easily.

The data to be tracked

Facebook CV/Resume Creator apps

Easy CV Creator EasyCV Curriculum Vitae

Example YouTube video: http://www.youtube.com/watch?v=egEadu5EUjI
CV Creative

Not popular…

My CV
Works with the http://moncv.com/ service.

Resume Factory

My Resume

1.5 of 5

Resume Central

Captain Resume
http://www.facebook.com/apps/application.php?ref=sgm&id=23892177864
3.5 of 5

LinkedIn:
Share it on facebook
http://www.facebook.com/apps/application.php?id=6394109615&ref=appd

Opensource stuff:
http://www.kite-eu.org/kite/en/download/
http://digitaldisruptions.org/rhizome/

http://www.margaperez.com/2009/08/resurfacing-the-kite-europass-cv-plug-in-for-wordpress/
http://digitaldisruptions.org/rhizome/2009/10/12/updating-the-application-profile-of-the-europass-cv-based-on-hr-xml-candidate-specifications-3-0/

http://digitaldisruptions.org/rhizome/2009/08/06/resurfacing-the-kite-europass-cv-plug-in-for-wordpress/comment-page-1/#comment-158
HR-XML:
http://www.sarmsoft.com/product/resumebuilder/

Plone:
http://plone.org/products/faculty-cv

Java:
http://gestcv.sourceforge.net/
http://sourceforge.net/projects/gestcv/
http://sourceforge.net/projects/lusid/

hResume:
Could not find a creator which worked

Conference Management
http://pkp.sfu.ca/?q=ocs
http://www.conftool.net/
http://sourceforge.net/projects/wcmt/
Conman
http://github.com/herlo/ConMan
http://blog.utos.org/2008/01/31/utosf-hacknight-a-grand-success/
http://conman.utosc.com/pages/home/
http://code.google.com/p/utos-conman/
registration
http://code.google.com/p/scalereg/

Corga
http://corga.sourceforge.net/

Drupal conference registration
http://drupalmodules.com/module/conference

Social Network
http://www.boonex.com/dolphin/
http://www.patrick-opitz.com/projects/facelift/about/
http://www.xoops.org/ This is an open source social network which looks interesting but I am not sure how much momentum is behind it.

4.5 out of 5
This social network looks really cool and targets the e-portfolio
http://mahara.org/

Services:
http://en.easy-cv.com/

People Exist in Space and Time

The situation, though, is that everything that goes into a resume or a CV after the biographical information is an event in which the person was involved, a skill they have, or a resource they have helped to create. So if we could automatically pull information from the events and resources and then organize it according to Who, we would almost be there. (I am not sure how our company is tracking this kind of information. It is most likely in an MS Word document.)
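
The "organize by Who" step might look something like the sketch below, assuming the events, skills and resources have already been pulled out of wherever they live. The Record fields and the cv_outline function are invented for illustration and do not reflect how any of the tools listed here store their data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    who: str   # the person the record belongs to
    kind: str  # "event", "skill" or "resource"
    what: str  # title or description
    when: str  # ISO date such as "2010-02-01", or "" if unknown


def cv_outline(records):
    """Group records by person and kind, most recent first within each kind."""
    outline = defaultdict(lambda: defaultdict(list))
    for record in sorted(records, key=lambda r: r.when, reverse=True):
        outline[record.who][record.kind].append(record.what)
    return {who: dict(kinds) for who, kinds in outline.items()}


if __name__ == "__main__":
    sample = [
        Record("Ana", "event", "Grant workshop", "2010-01-15"),
        Record("Ana", "resource", "Working paper on tone", "2009-06-01"),
        Record("Ana", "event", "Phonetics course", "2010-02-01"),
        Record("Ben", "skill", "Ruby", ""),
    ]
    for who, kinds in cv_outline(sample).items():
        print(who, kinds)
```

Sorting on the date before grouping is what turns a pile of records into something CV-shaped, which is why time matters as an attribute.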

Events have several attributes; one of those is time.

This is course management software: with calendars and DHTML in Video.
http://www.olat.org/website/en/html/about_features.html

NO CALENDAR….
http://www.davical.org/
http://www.bedework.org/bedework/update.do
http://trac.calendarserver.org/wiki/CalDAVTester

So how do we pull data from the container which holds our resources?
Well, the container holding our resources is DSpace.
But these options work with WordPress….
http://wordpress.org/extend/plugins/wikindx-macro-plug-in-for-wordpress/
http://wikindx.sourceforge.net/
http://refdb.sourceforge.net/features.html

http://simile.mit.edu/wiki/Citeline_Developer%27s_Guide
http://simile.mit.edu/wiki/Citeline_User%27s_Guide

Example resumes
http://matthewlevine.com/resume

Why do we need a Resume now that I have a Job?
http://optional.is/required/2010/02/01/have-gun-will-travel/