Unix Intro 1

Unix

"Ken Thompson (sitting) and Dennis Ritchie at PDP-11 (2876612463)" by Peter Hamer. Uploaded by Magnus Manske. Licensed under Creative Commons Attribution-Share Alike 2.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Ken_Thompson_(sitting)_and_Dennis_Ritchie_at_PDP-11_(2876612463).jpg

Unix is a family of operating systems, stretching back to its origins at AT&T Bell Laboratories in 1969. Unix introduced many of the features now common in virtually every operating system.

 

Unix is now everywhere

  • Amazon Kindle is Linux
  • Android is Linux
  • Apple iOS is a version of BSD Unix
  • Mac OS X is a version of BSD Unix too
  • most of the web servers on the Internet
  • the server farms of Amazon, Facebook, Google…

This Talk is Not About Linux

Linux is a free, open source operating system that implements the features of Unix. You can think of it as a Unix lookalike, or (more properly) a Unix work-alike. Note that all Unix operating systems are Unix work-alikes.

In this talk, we’re going to talk about Unix, not Linux–but everything we talk about applies to Linux, because Linux is a form of Unix.

But Here is How To Get Linux

You can get Linux for free from the following distributors:

For folks who are just getting started, Fedora or Ubuntu are probably the right place to start.

The Shell

When people think of Unix, they think of this:

[alice@desktop] ~$ ssh abc123@someserver.nyu.edu
$ cd /var/log
$ zcat -f messages messages-20121014.gz | grep -i ssh > out && cat out
Oct 12 10:12:59 wpn1 yum[14228]: Updated: openssh-5.3p1-81.el6.x86_64
Oct 12 10:14:00 wpn1 yum[14228]: Updated: openssh-server-5.3p1-81.el6.x86_64
Oct 12 10:14:01 wpn1 yum[14228]: Updated: openssh-clients-5.3p1-81.el6.x86_64
Oct 12 10:14:12 wpn1 yum[14228]: Updated: libssh2-1.2.2-11.el6_3.x86_64
$ wc -l out
4 out

This is the “shell,” a text-based way to interact with Unix. You can manage files, run programs, and do most anything using the shell.

A Shell Session Example

Let’s look at a simple shell session:

[alice@server ~]
[alice@server ~] pwd
/home/alice/example
[alice@server ~] ls
input.txt
[alice@server ~] cat input.txt
Unix was created at AT&T Bell Laboratories.
Bash (the "Bourne Again Shell") is a shell for Unix.
Unix provides basic text utilities like grep and diff.
[alice@server ~] grep Lab input.txt
Unix was created at AT&T Bell Laboratories.
[alice@server ~] exit

Anatomy of a Shell Session

Now let’s go into some details.

This is a shell prompt, which can look different on different systems:
[alice@server ~]

The user types the command pwd, and then the shell prints the name of the present working directory (/home/alice/example):
[alice@server ~] pwd
/home/alice/example

The user types the command ls, and then the shell prints a list of files in the current directory (there is just one file in this example):

[alice@server ~] ls
input.txt

The user types the command cat input.txt, which prints out the contents of the file input.txt:

[alice@server ~] cat input.txt
Unix was created at AT&T Bell Laboratories.
Bash (the "Bourne Again Shell") is a shell for Unix.
Unix provides basic text utilities like grep and diff.

The user types the command grep with some arguments, and the shell runs that command, printing the command’s output:
[alice@server ~] grep Lab input.txt
Unix was created at AT&T Bell Laboratories.

The user ends the shell session:
[alice@server ~] exit

Prompt, Command, Output, Repeat

Your interaction with the shell will usually feel like the preceding example.  You will:

  • see a shell prompt
  • issue a command
  • see the output of the command
  • then you will see the shell prompt again, and repeat the process above

And when you are finished running commands, you will type exit to end the shell session.

Philosophy: Small is Beautiful

Unix encourages you to use small programs that each do one thing well. For example, the command ls will only list the files present in a directory. That’s it. If you want to read a file, you use a different program (such as cat or less).

You can chain these programs together in flexible ways (more about that later) or collect a series of commands into a shell script, which is a program you create (more about that later).

Philosophy: Do One Thing Well

This is Doug McIlroy from Bell Labs, discussing the early design of Unix.

“The philosophy that everyone started to put forth was ‘Write programs that do one thing and do it well. Write programs to work together. Write programs that handle text streams, because that is a universal interface.’”

Running a Program: $PATH

To run a program, you type in the program’s name at the shell prompt. The shell looks for the program in a list of directories. This list of directories is called your path. The list is stored in a special shell variable called $PATH.

$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin

If the program is not in one of those directories, the shell won’t be able to find it.
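Here is a rough sketch of that lookup, written as a small shell function. The name find_in_path is made up for this example, and ls is just a convenient program to search for. The real shell does more than this (it also knows about built-in commands and remembers earlier lookups), so treat it as an illustration only.

```shell
#!/bin/sh
# Walk the directories listed in $PATH, in order, and print the first
# one that contains an executable file with the requested name.
# find_in_path is a hypothetical helper, not a standard command.
find_in_path() {
    prog=$1
    old_ifs=$IFS
    IFS=:                      # $PATH entries are separated by colons
    for dir in $PATH; do
        if [ -x "$dir/$prog" ] && [ ! -d "$dir/$prog" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$prog"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                   # not found in any directory
}

found=$(find_in_path ls)
echo "first match in PATH: $found"
# command -v asks the shell itself where it would find the program
echo "the shell's answer:  $(command -v ls)"
```

If a program lives outside those directories, you can still run it by typing its full path name.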

A Simple File System

A “file system” is nothing more than a way to store data on a computer. It is analogous to a real world file cabinet that:

  • contains many paper files
  • has its files typically organized into folders that group related files
  • has folders that can contain other folders, which can themselves contain files and folders

Philosophy: Files are Stored in a Tree

Unix took this file cabinet analogy and decided to store files in a tree-like structure. This is the way most file systems work now.

Think about a real tree first:

  • A tree has leaves, branches, and a root
  • A leaf can only belong to one branch
  • A branch can have leaves and sub-branches
  • All of the branches are connected to the root, and to each other

File System Tree

Unix file system tree

A Shell Session Example, Revisited

[alice@server ~]
[alice@server ~] pwd
/home/alice/example
[alice@server ~] ls
input.txt
[alice@server ~] cat input.txt
Unix was created at AT&T Bell Laboratories.
Bash (the "Bourne Again Shell") is a shell for Unix.
Unix provides basic text utilities like grep and diff.
[alice@server ~] grep Lab input.txt
Unix was created at AT&T Bell Laboratories.
[alice@server ~] exit

File System Tree Detail

Here’s the file that we saw in the “shell session example revisited,” which is /home/alice/example/input.txt:

Unix file system tree detail

Aside: File System as Inverted Tree

You’ll frequently see the Unix file system shown as an inverted tree. Don’t worry. Everything is OK.

Unix file system inverted tree

Directories, Files, and Slash

  • The root is the directory at the top of the file system, named / in Unix
  • The root of the filesystem contains directories (e.g. /usr or /tmp) and files
  • Each directory can contain files and/or more directories
  • The forward slash character is the delimiter between directories

Examples:

  • /
  • /usr/bin/grep
  • /home/bob
  • /home/alice/example/input.txt

Anatomy of a File Path Name

/home/alice/example/input.txt

Breaking it down:

  • A directory named /
  • contains a directory named home
  • which contains a directory named alice
  • which contains a directory named example
  • which contains a file named input.txt

Notice that the forward slash character indicates each directory boundary.

Absolute Path vs. Relative Path

A path name that starts with / is an “absolute path.” This is the full name of the file, with no ambiguity.

You can also specify a file by a “relative path,” which means the shell will look for it based on the present working directory. This is why you don’t have to specify the full path name of a file within the current directory.
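A small sketch of the difference, run in a throwaway directory created with mktemp so that nothing real is touched. The directory and file names are only examples.

```shell
#!/bin/sh
# Create a scratch directory; mktemp -d prints its absolute path.
workdir=$(mktemp -d)
mkdir -p "$workdir/example"
echo "hello" > "$workdir/example/input.txt"

cd "$workdir/example"
relative=$(cat input.txt)                      # looked up from the working directory
absolute=$(cat "$workdir/example/input.txt")   # full name, starting from /

cd "$workdir"
from_parent=$(cat example/input.txt)           # the relative path changes when you move

echo "$relative / $absolute / $from_parent"

cd / && rm -r "$workdir"
```

All three commands read the same file. The absolute path works from anywhere, while each relative path only works from the directory it was written for.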

Parent Directory: Using “..”

When using relative path names, you may be wondering how you refer to a file that is in a parent directory. The Unix file system provides a way to do this using the special entry “..” that is present in every directory and refers to its parent. Example:

$ pwd
/home/bob
$ ls ../alice/example
input.txt
$ cd ../../
$ pwd
/

Philosophy: A File is a Sequence of Bytes

Another Unix innovation that has become commonplace is the idea that a file is just a sequence of binary data. Before Unix, it was very difficult to do simple file reading and writing, because the OS enforced structure on files.

In Unix, it is very easy to open a file, read it from start to finish, write to the end of it, overwrite it completely…

Aside: File Extensions (like .txt)

In Unix, file extensions are only customary. If you rename a text file from output.txt to output.mov, it is still a text file. Unix does not care about the contents of your files. A file is just a sequence of bytes.

You can use the command file to inspect a file’s contents:

$ ls
perl-PBS.spec
$ file perl-PBS.spec
perl-PBS.spec: UTF-8 Unicode English text
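To see that renaming changes nothing but the name, here is a minimal sketch that compares a checksum of the bytes before and after the rename. It uses cksum rather than file, on the assumption that cksum is available on just about any Unix system; the file names are the ones from the example above.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'plain text, whatever the name says\n' > "$workdir/output.txt"

before=$(cksum < "$workdir/output.txt")        # checksum and size of the bytes
mv "$workdir/output.txt" "$workdir/output.mov" # change only the extension
after=$(cksum < "$workdir/output.mov")
contents=$(cat "$workdir/output.mov")          # still readable as text

echo "checksum before: $before"
echo "checksum after:  $after"

rm -r "$workdir"
```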

Basic File System Utilities

  • cd (change directory)
  • cp (copy a file)
  • ls (list files)
  • mkdir (make directory)
  • mv (move a file or directory)
  • rm (remove a file)
  • rmdir (remove an empty directory)
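A short sketch that uses each of these utilities once, inside a scratch directory. The names project and notes.txt are invented for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir"                  # change directory

mkdir project                  # make a directory
echo "draft" > notes.txt
cp notes.txt project/          # copy a file into it
mv notes.txt notes-old.txt     # move (rename) a file
listing=$(ls project)          # list files in the directory
echo "project contains: $listing"

rm project/notes.txt notes-old.txt   # remove the files
rmdir project                  # works only because project is now empty
remaining=$(ls)

cd / && rm -r "$workdir"
```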

Less-Basic File System Utilities

We’ll talk about some of these:

  • chgrp (change the group associated with a file)
  • chmod (change the permission mode associated with a file)
  • chown (change the owner associated with a file)
  • cp -r (copy a directory recursively)
  • find (search for files/directories)
  • ls -l (list files, showing more info)
  • ln and ln -s (create links, which are aliases for files)
  • rm -r (remove files and/or directories recursively)
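A brief sketch of a few of these in action (cp -r, ln -s, find, and rm -r), again in a scratch directory. The directory layout is made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p data/raw
echo "ACTG" > data/raw/sample.txt

cp -r data backup                       # copy the whole directory tree
ln -s data/raw/sample.txt latest        # symbolic link: an alias for the file
via_link=$(cat latest)                  # reading the link reads the file

count_before=$(find . -name 'sample.txt' | wc -l)   # original plus the copy
rm -r backup                            # remove the copied tree, contents and all
count_after=$(find . -name 'sample.txt' | wc -l)

echo "found $count_before copies, then $count_after after rm -r"

cd / && rm -r "$workdir"
```

Be careful with rm -r: it does not ask for confirmation, and there is no trash can to recover from.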

If You Don’t Know, Ask the man

Most Unix utilities have a manual, which you can find by using the utility man. Type man followed by the name of a command and you will probably find that command’s manual:

$ man cat
$ man file
$ man ls
$ man man

Philosophy: Text is Wonderful

This is Doug McIlroy from Bell Labs, discussing the early design of Unix.
You’ll see Doug one more time today.

“The philosophy that everyone started to put forth was ‘Write programs that do one thing and do it well. Write programs to work together. Write programs that handle text streams, because that is a universal interface.’”

Plain Text Utilities are Wonderful

You’ll find that plain text has a privileged place in Unix. There are many utilities that manipulate text and text files. Some of the most basic utilities are:

  • cat (concatenate files)
  • diff (show differences between text files)
  • echo (print some text to the screen or a file)
  • grep (find a text string in a file)
  • head (show the first few lines of a file)
  • less (interactive program to view a text file)
  • sort (rearrange the lines of a file into order)
  • tail (show the last few lines of a file)
  • wc (count the lines, words, and bytes in a file)
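Here is a quick sketch of several of these utilities applied to one tiny file. The file fruit.txt and its contents are invented for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir"
printf 'pear\napple\nfig\n' > fruit.txt

first=$(head -n 1 fruit.txt)        # first line of the file
last=$(tail -n 1 fruit.txt)         # last line of the file
lines=$(wc -l < fruit.txt)          # number of lines
matches=$(grep -c 'p' fruit.txt)    # how many lines contain the letter p

sort fruit.txt > sorted.txt         # lines rearranged into alphabetical order
first_sorted=$(head -n 1 sorted.txt)

# diff exits with a non-zero status when the two files differ
if diff fruit.txt sorted.txt > /dev/null; then
    same=yes
else
    same=no
fi

echo "first=$first last=$last lines=$lines matches=$matches same=$same"

cd / && rm -r "$workdir"
```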

What is Text?

A lot of common file types are actually just text files:

  • Email
  • Genomic sequencing data
  • HTML web pages
  • JavaScript
  • Java, Perl, Python–most source code
  • Shell commands

Pipes

A “pipe” allows you to connect the output of one program to the input of another. This is another key Unix innovation. Let’s take a look at an example:

$ tail -f /var/log/messages | grep SEVERE
SEVERE: Context [/wasp] startup failed due to previous errors
SEVERE: The web application [/wasp] registered the JDBC driver [com.mysql.jdbc.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.
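Pipes can be chained more than once. The sketch below strings five small programs together to find the most frequent word in a list. The word list is made up, and printf stands in for whatever program would normally produce the text.

```shell
#!/bin/sh
# Each program reads the previous program's output:
#   sort     groups identical lines together
#   uniq -c  collapses each group into "count word"
#   sort -rn puts the largest count first
#   head     keeps only the top line
top=$(printf 'ssh\nhttp\nssh\nftp\nssh\nhttp\n' \
        | sort | uniq -c | sort -rn | head -n 1)
echo "most frequent: $top"
```

None of these programs knows anything about the others. Each one just reads text and writes text.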

Philosophy: Programs Should Work Together

This is Doug McIlroy from Bell Labs, discussing the early design of Unix.
You’ll see Doug zero more times today.

“The philosophy that everyone started to put forth was ‘Write programs that do one thing and do it well. Write programs to work together. Write programs that handle text streams, because that is a universal interface.’”

Shell Redirection

When you run a program, typically that program will produce output of some kind (possibly text). It might also require input from you. Unix allows you to tell a program:

  • where to write its output (into a file)
  • where to read its input (from a file)

This is called shell redirection. You tell the shell to redirect by using the redirectors, the greater-than and less-than symbols.

Shell Redirection: Read with <

Previously, you saw how the input of a program can be connected to the output of another program using a pipe. But a program’s input can also be connected to a file.

For instance, a program that expects text to be input from the user can read that text from a file instead. You do this with the less-than sign:

$ someprogram < input.txt

You might not see or use the input redirector often.
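One place where it does come up is with programs that only read from their input and never open files themselves. A minimal sketch using tr, which is one such program (the input file is made up for the example):

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'b\na\nc\n' > "$workdir/input.txt"

# sort can open a file by name, but it can also read redirected input
sorted=$(sort < "$workdir/input.txt")

# tr takes no file name arguments, so redirection (or a pipe) is the way in
upper=$(tr '[:lower:]' '[:upper:]' < "$workdir/input.txt")

echo "$sorted"
echo "$upper"

rm -r "$workdir"
```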

Shell Redirection: Overwrite with >

Usually, a program outputs text to the shell and you see it after the command is executed:

$ echo hello
hello

But you can tell the shell to send that output to a file instead:

$ echo hello > output.txt
$ cat output.txt
hello

Be careful, because this can overwrite an existing file:

$ echo byebye > output.txt
$ cat output.txt
byebye

Shell Redirection: Append with >>

If you don’t want to overwrite the file, you can append to the end of the file instead using the double greater-than redirector:

$ cat output.txt
byebye
$ echo "hello again" >> output.txt
$ cat output.txt
byebye
hello again

When Something Goes Wrong, 2>

If a program encounters an error, it generally writes error messages to your shell session. This can be annoying, because you might have error text mixed in with output. Unix allows you to send error output to a file of its own (presumably for future review) using the 2> redirector:

$ find /tmp 2> /home/bob/find-errors.txt

Standard Output, Standard Error

Unix makes a distinction between “output” and “errors.” Generally, you are only concerned with the output from a program, called “standard output.” But if something goes wrong, the program can write errors to something called “standard error.” This allows you to redirect regular output to one file, while redirecting error output to another file:

$ find /tmp > find-output.txt 2> find-errors.txt
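To watch the two streams separate, here is a small sketch. The function report is hypothetical; it stands in for any program that writes both regular output and error messages.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# A stand-in program: one line to standard output, one to standard error.
report() {
    echo "this is regular output"
    echo "this is an error message" >&2    # >&2 writes to standard error
}

# Send each stream to its own file.
report > "$workdir/out.txt" 2> "$workdir/err.txt"

# Or send both to the same file: 2>&1 means "errors go wherever output goes".
report > "$workdir/both.txt" 2>&1

out=$(cat "$workdir/out.txt")
err=$(cat "$workdir/err.txt")
both_lines=$(wc -l < "$workdir/both.txt")

echo "out.txt:  $out"
echo "err.txt:  $err"

rm -r "$workdir"
```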

Processes

When a program is started, Unix calls the running program a “process.” Unix keeps track of all processes on the system, and you can see your processes using two important commands:

  • ps (one time listing of running programs)
  • top (interactive, continuous display of running programs)

Once the program is finished, the process is no longer visible in ps or top.

Process ID (PID)

Every process has a unique number assigned to it, called the “process identifier” or PID.

$ ps
PID TTY          TIME CMD
21719 pts/0    00:00:00 bash
22018 pts/0    00:00:00 grep
22019 pts/0    00:00:00 ps
$

Background

When you run a program, the shell waits for it to complete. Until the program completes, you cannot use the shell for anything else. But if you end your command with an ampersand, the shell will run the command in the background, allowing you to continue using the shell for other tasks.

$ grep ACTG reallylongfile.fastq &
[1] 22023
$ ls
reallylongfile.fastq

Foreground

You can bring a backgrounded process back to the foreground using the command fg:

$ grep ACTG reallylongfile.fastq > out &
[1] 22023
$ ls
out reallylongfile.fastq
$ less out
$ fg

Once you type fg, the shell brings grep back to the foreground and waits for it to complete. (The job was never paused; it kept running in the background the whole time.)
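The fg command is meant for interactive use. Inside a script, the closest counterpart is wait, which blocks until a background process finishes. A minimal sketch, with sleep standing in for a long-running grep:

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Start a slow job in the background.
( sleep 1; echo "done" > "$workdir/result.txt" ) &
pid=$!                                  # $! holds the PID of the last background job

echo "background PID: $pid; the shell is already free for other commands"

wait "$pid"                             # block until the background job finishes
status=$?                               # its exit status (0 means success)
result=$(cat "$workdir/result.txt")
echo "job finished with status $status and wrote: $result"

rm -r "$workdir"
```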

Process Signals

You can use the PID to send a signal to the program. The most common signal is to tell the program to stop, usually by sending it either the terminate signal, or the kill signal. You send a signal using the kill command and the PID, like this:

$ ps
PID TTY          TIME CMD
21719 pts/0    00:00:00 bash
22020 pts/0    00:00:00 grep
$ kill 22020
[1]+  Terminated            grep ACTG reallylongfile.fastq
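The same thing can be done without looking up the PID by hand, because the shell remembers the PID of the last background job in $!. This is a sketch, with sleep standing in for a program that runs too long:

```shell
#!/bin/sh
sleep 60 &                 # a long-running background process
pid=$!

kill "$pid"                # with no other options, kill sends the terminate signal

# wait reports how the process ended; a status above 128 means it was
# stopped by a signal (typically 143, which is 128 + 15, in bash and dash)
status=0
wait "$pid" || status=$?
echo "process $pid ended with status $status"
```

If a program ignores the terminate signal, kill -9 followed by the PID sends the kill signal, which the program cannot ignore.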

Process Signal: Control C

The other common way to send a signal to a process is to press the Control and C keys on your keyboard at the same time. This sends an “interrupt” signal to the process, which usually terminates the process.

$ grep ACTG reallylongfile.fastq
^C
$

You’ll frequently see the control C sequence written as ^C (the ^ used as a placeholder for the unprintable character associated with the control key).

Putting It Together: Shell Scripts

If you create a text file with a Unix command on each line, you have created a shell script. This technique is commonly used in Unix to build programs that tie together the Unix utilities and other software on the system.

We’ll leave discussion of shell scripts for another time. For now you should just know that a shell script is not magic, it just contains Unix commands.
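Just to show there is no magic, here is a minimal sketch that writes a tiny script to a file and runs it. The script name count-ssh.sh and the sample log lines are invented for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# The script is an ordinary text file containing a Unix command.
cat > "$workdir/count-ssh.sh" <<'EOF'
#!/bin/sh
# Count the lines that mention ssh in the file named by the first argument.
grep -ci ssh "$1"
EOF

printf 'openssh updated\nkernel updated\nlibssh2 updated\n' > "$workdir/messages"

# Run it by handing the file to a shell. You could instead mark it
# executable with chmod u+x and run it by its path name.
count=$(sh "$workdir/count-ssh.sh" "$workdir/messages")
echo "lines mentioning ssh: $count"

rm -r "$workdir"
```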

Unix is a Multi-User System

Many users can use a single Unix system at the same time. Unix has ways of protecting users from each other, and the most important protection is arguably file system permissions.

Every user on the system has a unique user name and a numeric user identifier (UID), and can belong to one or more groups. Each group likewise has a unique name and a numeric group identifier (GID).

Users and Groups

Who am I?

$ whoami
alice

What groups am I in?

$ groups
alice aliceandbob team1 users

Am I successful?

$ successful
-bash: successful: command not found
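Behind the names, Unix tracks users and groups by number. A short sketch using the id command to show the numbers for whoever runs it (the values will differ from system to system):

```shell
#!/bin/sh
uid=$(id -u)        # numeric user identifier
gid=$(id -g)        # numeric identifier of the primary group
all_gids=$(id -G)   # identifiers of every group the user belongs to

echo "UID: $uid"
echo "primary GID: $gid"
echo "all GIDs: $all_gids"
```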

Permissions

Every file and directory has a user and group associated with it, and a set of permissions. You can see all of this with ls -l:

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

Permissions for ugoa

Each file or directory has a user and group associated with it. When you set permissions, you decide:

  • what permissions the owner of the file gets
  • what permissions the group of the file gets
  • what permissions everyone else gets

When we talk about these classes of users, we’ll abbreviate these as u, g, o, or a:

  • u – the user who owns the file
  • g – the group associated with the file
  • o – other users that are not the owner, and not in the group
  • a – every user on the system

Anatomy of ls -l Output: 1

Each line of output shows either a file or directory. The first column tells you:

  • whether it is a file or a directory
  • the permissions set for the file

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

The All-Important First Column

drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl

The first column has 10 characters in it:

  • The first character is either:
    • d (indicating this is a directory) or
    • - (indicating this is a file)
  • The next three characters show the permissions for the owner (u)
  • The next three characters show the permissions for the group (g)
  • The next three characters show the permissions for other users (o)

We’ll talk about what rwx means in a minute.

Anatomy of ls -l Output: 2 and 3

The second column (hard links) can be ignored for now. The third column shows you the owner:

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

Every file has an owner, which is usually the user that created the file.

Anatomy of ls -l Output: 4

The fourth column shows the group associated with each file/directory:

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

Every file has a group associated with it, usually the primary group of the user that created the file. This can be changed via the chgrp command.

Anatomy of ls -l Output: 5

The fifth column shows the size of the file/directory in bytes:

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

Anatomy of ls -l Output: 6

The sixth column shows the date and time the file was last modified:

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

Anatomy of ls -l Output: 7

The last column shows the name of the file or directory:

$ ls -l
total 72
drwxrwxr-x 3 alice team1        53  Mar 28  2012 build
-rwx------ 1 alice alice        62  Jul 11 11:59 crushcpu.pl
-rwx------ 1 alice alice        115 Jun 28 09:59 crushmem.pl
drwxrwx--- 5 alice aliceandbob  139 Jul 26 14:37 jobs
drwxrwxr-x 3 alice users        45  Jul 17 11:27 R

File Permissions: rwx

For any file, the file’s owner can specify whether a particular user or group can:

  • read the file (abbreviated r)
  • write/overwrite the file (abbr. w)
  • execute the file (abbr. x)

Directory Permissions: rwx

For any directory, the OS allows users to specify whether a particular user or group can:

  • list the contents of the directory (abbr. r)
  • add, rename, or delete files in the dir (abbr. w)
  • change directory into the directory (abbr. x)

Note that the same rwx is used here, but it means something slightly different than it did with files.

Symbolic Permissions, ugoa

The easiest way to manage permissions is using what are called “symbolic permissions.” This is where we use the u, g, o, and a that we talked about earlier. Examples:

Allow the group to read and write the file:

$ chmod g+rw output.txt

Allow the owner and the group to read, write, and execute the file:

$ chmod ug+rwx output.txt

Take away all privileges from anyone who is not the owner or in the file’s group:

$ chmod o-rwx output.txt
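To see what each of those commands does to the first column of ls -l, here is a sketch that applies them in order to a scratch file. The helper show_mode is made up for this example; it just prints the first ten characters of the ls -l line.

```shell
#!/bin/sh
workdir=$(mktemp -d)
f="$workdir/output.txt"
: > "$f"                                   # create an empty file

# show_mode is a hypothetical helper: print only the permission column
show_mode() { ls -l "$1" | cut -c1-10; }

chmod u=rw,go=r "$f"                       # a known starting point
start=$(show_mode "$f")                    # -rw-r--r--

chmod g+rw "$f"                            # group may read and write
after_g=$(show_mode "$f")                  # -rw-rw-r--

chmod ug+rwx "$f"                          # owner and group get everything
after_ug=$(show_mode "$f")                 # -rwxrwxr--

chmod o-rwx "$f"                           # everyone else gets nothing
after_o=$(show_mode "$f")                  # -rwxrwx---

printf '%s\n' "$start" "$after_g" "$after_ug" "$after_o"

rm -r "$workdir"
```

Note that + and - only add or remove the permissions you name; everything else is left as it was.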

Absolute Permissions

We’ll leave discussion of absolute permissions for another time. For now, just know that absolute permissions use numbers to specify the permissions (such as 755 or 600) instead of the symbolic permissions we just saw. Examples:

Allow the file owner full permission, allow the group to read and write the file, and allow everyone else to read:

$ chmod 764 output.txt

For the same file, take away all privileges from anyone who is not the owner or in the file’s group:

$ chmod 760 output.txt
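As a small preview, each digit is a sum: read counts 4, write counts 2, and execute counts 1. The first digit is for the owner, the second for the group, and the third for everyone else. The sketch below checks that arithmetic against what ls -l reports for the two examples above.

```shell
#!/bin/sh
workdir=$(mktemp -d)
f="$workdir/output.txt"
: > "$f"

# r = 4, w = 2, x = 1; add them up for each class of user
owner=$((4 + 2 + 1))     # rwx -> 7
group=$((4 + 2))         # rw- -> 6
other=4                  # r-- -> 4

chmod "$owner$group$other" "$f"            # same as chmod 764
mode_764=$(ls -l "$f" | cut -c1-10)        # -rwxrw-r--

chmod 760 "$f"                             # 0 for others: no permissions at all
mode_760=$(ls -l "$f" | cut -c1-10)        # -rwxrw----

echo "764 -> $mode_764"
echo "760 -> $mode_760"

rm -r "$workdir"
```

Unlike the symbolic form, the numeric form always sets all nine permissions at once.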

Philosophy of Unix

Let’s review the bits of Unix philosophy we have discussed:

  • Small is Beautiful, AKA Do One Thing Well
  • Files are Stored in a Tree
  • A File is a Sequence of Bytes
  • Text is Wonderful
  • Programs Should Work Together
  • Build Your Own Programs