hadoop - How to Get Pig to Work with lzo Files?


So, I have seen a few tutorials for this online, but each one seems to assume something different. Also, not all of them specify whether you are trying to work directly on a remote cluster, or to interact with a remote cluster from a local machine, etc...

That said, my goal is to get Pig on my local machine (a Mac) working with lzo-compressed files that live on a Hadoop cluster which is already set up to handle lzo files. Hadoop is installed locally, and I can get files from the cluster with hadoop fs -[command].

I also have Pig installed locally, and I can load and play around with non-lzo files, whether I run scripts through the Hadoop cluster or just run things through the grunt shell. My problem is only in finding a way to load lzo files. Can I just process them through ElephantBird, for instance? I have no idea, and there is only minimal information about this online.

So, any sort of short tutorial or answer for this would be great, and would hopefully help more people than just me.

I recently got this to work and wrote up a wiki on it for my colleagues. Here is an excerpt explaining how to get Pig to work with lzos. Hope it helps someone!

Note: This was written with a Mac in mind. The steps will be nearly identical on other OSes, and should definitely tell you what you need to know to configure everything on Windows or Linux, but you will have to extrapolate a little (obviously, change the Mac-centric folders to whatever your OS uses, etc...).

Hooking Pig up to be able to work with lzos

This was by far the most annoying and time-consuming part for me. Not because it is difficult, but because there are 50 different tutorials online, none of which are all that helpful. Anyway, what I did to get this working:

  1. Clone hadoop-lzo from GitHub.

  2. Compile hadoop-lzo to get the .jar and the native *.o libraries. You will need to compile this on a 64-bit machine.

  3. Copy the native libraries to $HADOOP_HOME/lib/native/Mac_OS_X-x86_64-64/. Copy the java jar to $HADOOP_HOME/lib and $PIG_HOME/lib (see the shell sketch after this list).

  4. Then configure hadoop and pig so that the property java.library.path points to the lzo native libraries. You can do this in $HADOOP_HOME/conf/mapred-site.xml:

      <property>
        <name>mapred.child.env</name>
        <value>JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/Mac_OS_X-x86_64-64/</value>
      </property>
  5. Then try out the grunt shell by running pig again, and make sure everything still works. If it does not, you probably messed something up in mapred-site.xml and you should double-check it.

  6. Great! We're almost there. All you need to do now is install elephant-bird. You can get that from GitHub (clone it).

  7. Now, to get elephant-bird to work, you will need quite a few prerequisites. These are listed on the page mentioned above and can change, so I will not specify them here. What I will mention is that the versions of these are very important: if you get an incorrect version and try running ant, you will get errors. So, do not try grabbing the prerequisites from brew or MacPorts, because you will likely get a newer version. Instead, just download the tarballs and build each one.

  8. Run ant in the elephant-bird folder to build the jar.

  9. For simplicity's sake, move all the relevant jars (hadoop-lzo-x.x.x and elephant-bird-x.x.x) that you will need to register somewhere you can easily find them. /usr/local/lib/hadoop/... works nicely. (Again, see the shell sketch after this list.)

  10. Try things out! Play around with loading normal files and lzos in the grunt shell. Register the relevant jars mentioned above, try loading a file, limiting the output to a manageable number, and dumping it. This should all work fine whether you are using a normal text file or an lzo. A sample grunt session follows below.
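
As promised in steps 3 and 9, here is a rough shell sketch of the build-and-copy work. The ant targets, the build output paths, and the 0.4.15 version number are my assumptions based on a typical hadoop-lzo checkout, not something from the steps above, so substitute whatever your build actually produces.

    # inside the hadoop-lzo clone; compile-native builds the jar and the native libs
    cd hadoop-lzo
    ant compile-native tar

    # step 3: copy the native libraries and the java jar into place
    # (paths and version number are illustrative)
    cp -R build/native/Mac_OS_X-x86_64-64/lib/* $HADOOP_HOME/lib/native/Mac_OS_X-x86_64-64/
    cp build/hadoop-lzo-0.4.15.jar $HADOOP_HOME/lib/
    cp build/hadoop-lzo-0.4.15.jar $PIG_HOME/lib/

    # step 9: stash the jars you register often somewhere easy to find
    mkdir -p /usr/local/lib/hadoop
    cp build/hadoop-lzo-0.4.15.jar /usr/local/lib/hadoop/
    cp /path/to/elephant-bird/elephant-bird-2.0.jar /usr/local/lib/hadoop/   # adjust to your ant output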

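And here is what step 10 might look like in the grunt shell. This is a sketch, not gospel: the jar names, the sample HDFS paths, and the use of elephant-bird's LzoTextLoader are assumptions on my part, so check which loader classes your elephant-bird version actually ships.

    -- register the jars stashed in step 9 (names are illustrative)
    REGISTER /usr/local/lib/hadoop/hadoop-lzo-0.4.15.jar;
    REGISTER /usr/local/lib/hadoop/elephant-bird-2.0.jar;

    -- a plain text file loads the usual way
    plain = LOAD '/tmp/sample.txt';

    -- an lzo-compressed text file goes through elephant-bird's loader
    logs = LOAD '/tmp/sample.txt.lzo' USING com.twitter.elephantbird.pig.load.LzoTextLoader();

    -- limit the output to a manageable number and dump it
    few = LIMIT logs 10;
    DUMP few;

If both the plain load and the lzo load dump rows, the whole chain (native libs, jars, mapred-site.xml) is wired up correctly.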