Accepting Command Line Arguments

Let’s create a new project with, as always, cargo new. We’re calling our project minigrep to distinguish it from the grep tool that you may already have on your system:

$ cargo new --bin minigrep
     Created binary (application) `minigrep` project
$ cd minigrep

Our first task is to make minigrep accept its two command line arguments: a string to search for and the filename. That is, we want to be able to run our program with cargo run, a string to search for, and a path to a file to search in, like so:

$ cargo run searchstring example-filename.txt

Right now, the program generated by cargo new cannot process arguments we give it. There are some existing libraries on crates.io that can help us accept command line arguments, but since you’re learning, let’s implement this ourselves.

Reading the Argument Values

We first need to make sure our program is able to get the values of the command line arguments we pass to it, for which we’ll need a function provided in Rust’s standard library: std::env::args. This function returns an iterator of the command line arguments that were given to our program. We haven’t discussed iterators yet, and we’ll cover them fully in Chapter 13, but for our purposes now we only need to know two things about iterators: iterators produce a series of values, and we can call the collect function on an iterator to turn it into a collection, such as a vector, containing all of the elements the iterator produces.
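To make those two facts concrete before we touch command line arguments, here is a minimal sketch that uses a range as the iterator. The helper name first_n is hypothetical and exists only for this illustration; it is not part of minigrep.

```rust
// Hypothetical helper for illustration only: collects the values produced
// by the range iterator 1..=n into a vector.
fn first_n(n: u32) -> Vec<u32> {
    (1..=n).collect()
}

fn main() {
    // An iterator produces a series of values, one at a time.
    for value in 1..=3 {
        println!("the iterator produced {}", value);
    }

    // collect turns the same series of values into a collection.
    let values = first_n(3);
    println!("{:?}", values);
}
```

The iterator returned by std::env::args can be handled in the same way, except that it produces String values rather than numbers.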

Let’s give it a try: use the code in Listing 12-1 to allow your minigrep program to read any command line arguments passed to it and then collect the values into a vector.

Filename: src/main.rs

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{:?}", args);
}

Listing 12-1: Collect the command line arguments into a vector and print them out

First, we bring the std::env module into scope with a use statement so that we can use its args function. Notice that the std::env::args function is nested in two levels of modules. As we talked about in Chapter 7, in cases where the desired function is nested in more than one module, it’s conventional to bring the parent module into scope, rather than the function itself. This lets us easily use other functions from std::env. It’s also less ambiguous than adding use std::env::args; and then calling the function with just args, because a bare args might easily be mistaken for a function that’s defined in the current module.

The args Function and Invalid Unicode

Note that std::env::args will panic if any argument contains invalid Unicode. If you need to accept arguments containing invalid Unicode, use std::env::args_os instead. That function returns OsString values instead of String values. We’ve chosen to use std::env::args here for simplicity because OsString values differ per platform and are more complex to work with than String values.
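For comparison, here is a rough sketch of what collecting arguments with std::env::args_os might look like. The helper to_lossy_string is a hypothetical name introduced for this example; it relies on the standard to_string_lossy method, which substitutes the Unicode replacement character for any invalid data. Whether a lossy conversion is acceptable depends on the program, so treat this as one option rather than a recommendation.

```rust
use std::env;
use std::ffi::OsString;

// Hypothetical helper for illustration only: converts an OsString into a
// String, replacing any invalid Unicode with the U+FFFD replacement character.
fn to_lossy_string(arg: &OsString) -> String {
    arg.to_string_lossy().into_owned()
}

fn main() {
    // args_os yields OsString values and does not panic on invalid Unicode.
    let args: Vec<OsString> = env::args_os().collect();

    for arg in &args {
        println!("{}", to_lossy_string(arg));
    }
}
```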

On the first line of main, we call env::args and immediately use collect to turn the iterator into a vector containing all of the values produced by the iterator. The collect function can be used to create many kinds of collections, so we explicitly annotate the type of args to specify that we want a vector of strings. Though we very rarely need to annotate types in Rust, collect is one function whose result you often do need to annotate because Rust isn’t able to infer what kind of collection you want.
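The following sketch shows why the annotation matters: the same iterator expression builds two different collections, and only the declared type tells collect which one to produce. The function names as_vector and as_set are hypothetical, and here the return type of each function plays the role that the annotation on args plays in Listing 12-1.

```rust
use std::collections::HashSet;

// Hypothetical helpers for illustration only. The bodies are identical;
// the declared return type decides which collection collect builds.
fn as_vector(words: &[&str]) -> Vec<String> {
    words.iter().map(|w| w.to_string()).collect()
}

fn as_set(words: &[&str]) -> HashSet<String> {
    words.iter().map(|w| w.to_string()).collect()
}

fn main() {
    let words = ["needle", "haystack", "needle"];

    let vector = as_vector(&words); // keeps duplicates and order
    let set = as_set(&words); // keeps only distinct values

    println!(
        "{} values in the vector, {} in the set",
        vector.len(),
        set.len()
    );
}
```

Removing the type information from either function would leave Rust with no way to choose between these collections, which is the situation the annotation on args avoids.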

Finally, we print out the vector with the debug formatter, :?. Let’s try running our code with no arguments, and then with two arguments:

$ cargo run
["target/debug/minigrep"]

$ cargo run needle haystack
...snip...
["target/debug/minigrep", "needle", "haystack"]

You may notice that the first value in the vector is "target/debug/minigrep", which is the name of our binary. This matches the behavior of the arguments list in C, and lets programs use the name by which they were invoked in their execution. It’s convenient to have access to the program name in case we want to print it in messages or change the behavior of the program based on what command line alias was used to invoke it, but for the purposes of this chapter we’re going to ignore it and only save the two arguments we need.
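As an aside, one possible way to leave the program name out of the vector entirely is to skip the first value the iterator produces. This is only a sketch of an alternative; the listings in this chapter keep the program name and index past it instead. The helper name without_program_name is hypothetical.

```rust
use std::env;

// Hypothetical helper for illustration only: drops the first value an
// argument iterator yields, which is conventionally the program name.
fn without_program_name<I>(args: I) -> Vec<String>
where
    I: Iterator<Item = String>,
{
    args.skip(1).collect()
}

fn main() {
    // Only the arguments supplied by the user remain in the vector.
    let user_args = without_program_name(env::args());
    println!("{:?}", user_args);
}
```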

Saving the Argument Values in Variables

Printing out the value of the vector of arguments has illustrated that the program is able to access the values specified as command line arguments. Now we need to save the values of the two arguments in variables so that we can use the values throughout the rest of the program. Let’s do that as shown in Listing 12-2:

Filename: src/main.rs

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();

    let query = &args[1];
    let filename = &args[2];

    println!("Searching for {}", query);
    println!("In file {}", filename);
}

Listing 12-2: Create variables to hold the query argument and filename argument

As we saw when we printed out the vector, the program’s name takes up the first value in the vector at args[0], so we’re starting at index 1. The first argument minigrep takes is the string we’re searching for, so we put a reference to the first argument in the variable query. The second argument will be the filename, so we put a reference to the second argument in the variable filename.
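The indexing described above can also be sketched as a small function that borrows both values from the vector. The name query_and_filename is hypothetical, and the hard-coded vector in main is a stand-in for the result of env::args().collect() so that the example runs without any arguments being passed. Like Listing 12-2, this sketch does no bounds checking, so indexing would panic if fewer than three values were present.

```rust
// Hypothetical helper for illustration only: borrows the query (index 1)
// and the filename (index 2) from the argument vector. Index 0 holds the
// program name and is ignored.
fn query_and_filename(args: &[String]) -> (&str, &str) {
    (args[1].as_str(), args[2].as_str())
}

fn main() {
    // Stand-in for `env::args().collect()` so the sketch is self-contained.
    let args = vec![
        String::from("target/debug/minigrep"),
        String::from("test"),
        String::from("sample.txt"),
    ];

    let (query, filename) = query_and_filename(&args);

    println!("Searching for {}", query);
    println!("In file {}", filename);
}
```

Because the function returns references, the values are not copied; query and filename stay valid for as long as args does.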

We’re temporarily printing out the values of these variables, again to prove to ourselves that our code is working as we intend. Let’s try running this program again with the arguments test and sample.txt:

$ cargo run test sample.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/minigrep test sample.txt`
Searching for test
In file sample.txt

Great, it’s working! The values of the arguments we need are being saved into the right variables. Later we’ll add some error handling to deal with potential error cases, such as when the user provides no arguments, but for now we’ll ignore that and work on adding file reading capabilities instead.
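To see why that error handling will matter, note that args[1] panics when the user supplies no arguments, because the vector then holds only the program name. As a rough preview, and not the approach this chapter eventually takes, the slice method get returns an Option instead of panicking. The helper name query_arg is hypothetical.

```rust
use std::env;

// Hypothetical helper for illustration only: returns the query argument if
// one was supplied, or None when the vector holds only the program name.
fn query_arg(args: &[String]) -> Option<&str> {
    args.get(1).map(|s| s.as_str())
}

fn main() {
    let args: Vec<String> = env::args().collect();

    match query_arg(&args) {
        Some(query) => println!("Searching for {}", query),
        None => println!("No search string was provided"),
    }
}
```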
