Functional Programming in Java: MapReduce

  • 2020-04-01 03:30:54
  • OfStack

Map (mapping) and reduce (reduction) are two very basic mathematical concepts that appeared early on in functional programming languages. It was not until 2003, when Google applied them to parallel computing in distributed systems, that the combination made its name in the wider computing world (functional programming fans may see it differently). In this article we take a first look at the map and reduce combination in Java 8, the release that added support for functional programming. This is only an introduction; a dedicated feature on them will follow.

Reducing a collection

So far we have introduced a few new techniques for manipulating collections: finding matching elements, picking a single element, and transforming a collection. What these operations have in common is that each one works on a single element at a time. None of them needs to compare elements or to operate on two elements at once. In this section we'll look at how to compare elements and carry a running result along as we walk through the collection.

Let's start with a simple example and work our way up. In the first example, let's walk through the friends collection and calculate the total number of characters for all the names.


System.out.println("Total number of characters in all names: " + friends.stream()
         .mapToInt(name -> name.length())
         .sum());

To figure out the total number of characters we need to know the length of each name. You can do this easily with the mapToInt() method. Once we've converted the names to their corresponding lengths, we just add them together. We have a built-in sum() method to do this. Here is the final output:


Total number of characters in all names: 26

We used a variant of the map operation, the mapToInt() method. The variants mapToInt(), mapToDouble(), and so on produce streams of a specific primitive type, such as IntStream and DoubleStream. We then calculated the total number of characters from the lengths returned.

In addition to sum(), many similar methods are available, such as max() for the maximum length, min() for the minimum length, sorted() to sort the lengths, average() for the average length, and so on.
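To make these methods concrete, here is a minimal sketch. The class name NameLengthStats and its helper methods are hypothetical, and the contents of the friends list are an assumption inferred from the output printed elsewhere in this article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;
import java.util.OptionalInt;

final class NameLengthStats {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // max() and min() return OptionalInt because the stream may be empty.
    static OptionalInt longest(List<String> names) {
        return names.stream().mapToInt(String::length).max();
    }

    static OptionalInt shortest(List<String> names) {
        return names.stream().mapToInt(String::length).min();
    }

    // average() returns OptionalDouble for the same reason.
    static OptionalDouble average(List<String> names) {
        return names.stream().mapToInt(String::length).average();
    }

    // sorted() is an intermediate operation; toArray() gathers the result.
    static int[] sortedLengths(List<String> names) {
        return names.stream().mapToInt(String::length).sorted().toArray();
    }

    public static void main(String[] args) {
        System.out.println("max: " + longest(FRIENDS).getAsInt());
        System.out.println("min: " + shortest(FRIENDS).getAsInt());
        System.out.println("average: " + average(FRIENDS).getAsDouble());
        System.out.println("sorted: " + Arrays.toString(sortedLengths(FRIENDS)));
    }
}
```

Note that max(), min(), and average() return optional values, so the caller has to decide what an empty collection should mean.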

Another interesting aspect of this example is that it follows the increasingly popular MapReduce pattern: the map() method does the mapping, and sum() is a common reduce operation. In fact, the sum() method in the JDK is implemented using the reduce() method. Let's look at some of the more common forms of the reduce operation.
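The relationship between sum() and reduce() can be sketched as follows. The class SumAsReduce is hypothetical, and the friends list is again an assumed sample; the point is only that both pipelines give the same total.

```java
import java.util.Arrays;
import java.util.List;

final class SumAsReduce {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // The built-in reduction.
    static int totalWithSum(List<String> names) {
        return names.stream().mapToInt(String::length).sum();
    }

    // The same reduction spelled out: start at 0 and add each length.
    static int totalWithReduce(List<String> names) {
        return names.stream().mapToInt(String::length).reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(totalWithSum(FRIENDS));    // prints 26
        System.out.println(totalWithReduce(FRIENDS)); // prints 26
    }
}
```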

Let's say we want to go through all the names and print the longest one. If more than one name has the maximum length, we print the first one we find. One way to do it is to work out the maximum length and then pick the first element with that length. But that requires walking through the list twice, which is wasteful. This is where the reduce operation comes in.

We can use the reduce operation to compare the lengths of two elements, return the longer one, and then compare that result with the remaining elements. Like the other higher-order functions we've seen, the reduce() method traverses the entire collection. In addition, it carries forward the result returned by the lambda expression. A piece of code will help make this clear.


final Optional<String> aLongName = friends.stream()
         .reduce((name1, name2) ->
            name1.length() >= name2.length() ? name1 : name2);
aLongName.ifPresent(name ->
         System.out.println(String.format("A longest name: %s", name)));

The lambda expression passed to the reduce() method takes two arguments, name1 and name2, compares their lengths, and returns the longer one. The reduce() method itself has no idea what we are doing. That logic is separated out into the lambda expression we pass in, which makes this a lightweight implementation of the strategy pattern.

This lambda expression fits the apply() method of BinaryOperator, a functional interface in the JDK. That is exactly the parameter type the reduce() method accepts. Let's run the code and see whether it correctly selects the first of the two longest names.


A longest name: Brian
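To see the BinaryOperator type and the strategy idea more explicitly, here is a small sketch in which the comparison logic is stored in named operators and handed to reduce(). The class NamePicker, the operator names, and the sample list are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;

final class NamePicker {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // Two interchangeable strategies; on a tie each keeps the earlier name.
    static final BinaryOperator<String> LONGER_OR_FIRST =
        (name1, name2) -> name1.length() >= name2.length() ? name1 : name2;

    static final BinaryOperator<String> SHORTER_OR_FIRST =
        (name1, name2) -> name1.length() <= name2.length() ? name1 : name2;

    // reduce() only knows it has a BinaryOperator; the policy is ours.
    static Optional<String> pick(List<String> names, BinaryOperator<String> policy) {
        return names.stream().reduce(policy);
    }

    public static void main(String[] args) {
        System.out.println(pick(FRIENDS, LONGER_OR_FIRST).get());  // prints Brian
        System.out.println(pick(FRIENDS, SHORTER_OR_FIRST).get()); // prints Nate
    }
}
```

Swapping the operator changes the outcome without touching the traversal code.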

As the reduce() method iterates over the collection, it first invokes the lambda expression on the first two elements, and the result of that call is carried into the next call. On the second call, name1 is bound to the result of the previous call and name2 is the third element of the collection. The remaining elements are processed in the same way. The result of the final lambda call is the result returned by the reduce() method as a whole.
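The call sequence described above can be observed with a small tracing sketch. The class ReduceTrace and the sample list are assumptions; recording calls from inside the lambda is done purely for illustration on a sequential stream and is not something to do in production code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReduceTrace {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // Returns one line per lambda invocation, in the order they happened.
    static List<String> trace(List<String> names) {
        List<String> calls = new ArrayList<>();
        names.stream().reduce((name1, name2) -> {
            String winner = name1.length() >= name2.length() ? name1 : name2;
            calls.add(name1 + " vs " + name2 + " -> " + winner);
            return winner;
        });
        return calls;
    }

    public static void main(String[] args) {
        // Six names lead to five invocations; the first line printed is
        // "Brian vs Nate -> Brian".
        trace(FRIENDS).forEach(System.out::println);
    }
}
```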

The reduce() method returns an Optional because the collection passed to it might be empty, in which case there is no longest name at all. If the list has only one element, reduce() returns that element directly, without calling the lambda expression.
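Both edge cases can be checked with a short sketch. The class ReduceEdgeCases is hypothetical, and the AtomicInteger counter exists only to show how often the lambda runs.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

final class ReduceEdgeCases {
    // Picks the longest name and counts how often the lambda is invoked.
    static Optional<String> longest(List<String> names, AtomicInteger calls) {
        return names.stream().reduce((name1, name2) -> {
            calls.incrementAndGet();
            return name1.length() >= name2.length() ? name1 : name2;
        });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();

        Optional<String> none = longest(Collections.<String>emptyList(), calls);
        System.out.println(none.isPresent()); // prints false

        Optional<String> only = longest(Collections.singletonList("Brian"), calls);
        System.out.println(only.get());       // prints Brian
        System.out.println(calls.get());      // prints 0
    }
}
```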

From this example we can infer that the result of reduce() is at most one element of the collection. If we want a default or base value, we can use a variant of reduce() that accepts an extra parameter. For example, if Steve is the shortest name we are willing to accept, we can pass it to the reduce() method like this:


final String steveOrLonger = friends.stream()
     .reduce("Steve", (name1, name2) ->
            name1.length() >= name2.length() ? name1 : name2);

If any name is longer than the base value, that name is selected; otherwise the base value Steve is returned. This version of reduce() does not return an Optional, because it falls back to the default value when the collection is empty. There is no need to worry about a missing return value.
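Here is a runnable sketch of this variant. The class ReduceWithBaseValue and both sample lists are assumptions. Strictly speaking, the Javadoc expects the first argument to be an identity for the operator, which "Steve" is not, so the sketch sticks to a sequential stream.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReduceWithBaseValue {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // "Steve" is the starting value; only a strictly longer name replaces it.
    static String steveOrLonger(List<String> names) {
        return names.stream()
            .reduce("Steve", (name1, name2) ->
                name1.length() >= name2.length() ? name1 : name2);
    }

    public static void main(String[] args) {
        // No sample name has more than five characters, so Steve stays.
        System.out.println(steveOrLonger(FRIENDS));                          // prints Steve
        System.out.println(steveOrLonger(Arrays.asList("Al", "Jonathan")));  // prints Jonathan
        System.out.println(steveOrLonger(Collections.<String>emptyList()));  // prints Steve
    }
}
```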

Before we close this chapter, let's take a look at an operation on collections that is basic but surprisingly awkward: joining elements.

Joining elements

We've learned how to find elements, traverse them, and transform collections. But there is another common operation, joining the elements of a collection, and without the new join() function our clean and elegant code would fall apart. This simple method is practical enough to have become one of the most commonly used functions in the JDK. Let's see how it can be used to print the elements of a list, separated by commas.

Let's stick with the friends list. What would it take to print all the names separated by commas using only the older parts of the JDK library?

We have to go through the list and print the elements one by one. The for-each loop introduced in Java 5 is an improvement over the older indexed loop, so let's use it.


for(String name : friends) {
      System.out.print(name + ", ");
}
System.out.println();

The code is simple, so let's see what the output is.

Brian, Nate, Neal, Raju, Sara, Scott,

Damn, there is a nasty comma at the end. How do we get Java not to put a comma there? Unfortunately, the loop handles every element the same way, and it is not easy to treat the last one specially. To solve this problem, we can go back to the old-style loop.


for(int i = 0; i < friends.size() - 1; i++) {
      System.out.print(friends.get(i) + ", ");
}
if(friends.size() > 0)
      System.out.println(friends.get(friends.size() - 1));

Let's see if the output of this version is OK.

Brian, Nate, Neal, Raju, Sara, Scott

The results are good, but the code isn't. Save us, Java.

We don't have to endure this pain anymore. The StringJoiner class in Java 8 takes care of these problems. Not only that, the String class also adds a join method so that we can replace the above mess with a single line of code.


System.out.println(String.join(", ", friends));

Take a look: the result is as good as the code.

Brian, Nate, Neal, Raju, Sara, Scott

Under the hood, the String.join() method uses the StringJoiner class to join the values passed as the second argument (either a variable-length argument list or an Iterable such as our list) into one long string, with the first argument as the separator. The method is of course good for more than joining with commas. For example, we can pass in a handful of paths and easily build a classpath, thanks to these new methods and classes.
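As a rough sketch of the classpath idea and of StringJoiner itself, consider the following. The class JoinExamples, its methods, and the path entries are all hypothetical; File.pathSeparator is the platform's classpath separator (":" on Unix-like systems, ";" on Windows).

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

final class JoinExamples {
    // Uses the varargs overload of String.join().
    static String classpath(String... entries) {
        return String.join(File.pathSeparator, entries);
    }

    // StringJoiner used directly, with an optional prefix and suffix.
    static String bracketed(List<String> names) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String name : names) {
            joiner.add(name);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Hypothetical path entries, for illustration only.
        System.out.println(classpath("lib/a.jar", "lib/b.jar", "classes"));
        System.out.println(bracketed(Arrays.asList("Brian", "Nate"))); // prints [Brian, Nate]
    }
}
```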

Now that we know how to join list elements, we can also transform the elements before joining them, and we already know how to do that with the map() method. We can likewise use the filter() method to pick out just the elements we want. The final step, joining the elements with a comma or some other delimiter, is simply a reduce operation.

We could concatenate the elements into a string using the reduce() method, but that takes a bit of work. The JDK has a handy collect() method, itself a variation of reduce(), that we can use to combine elements into a desired value.
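To give a feel for that extra work, here is a sketch that joins the uppercased names with reduce() and, for comparison, with collect() and the JDK's Collectors.joining() collector. The class ReduceJoin and the sample list are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ReduceJoin {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // Hand-rolled: we supply the separator logic ourselves, must unwrap
    // the Optional, and build a new intermediate string at every step.
    static String joinWithReduce(List<String> names) {
        return names.stream()
            .map(String::toUpperCase)
            .reduce((left, right) -> left + ", " + right)
            .orElse("");
    }

    // The collector handles separators and the empty case for us.
    static String joinWithCollect(List<String> names) {
        return names.stream()
            .map(String::toUpperCase)
            .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(joinWithReduce(FRIENDS));
        System.out.println(joinWithCollect(FRIENDS));
    }
}
```

Both methods produce the same string; the difference lies in how much of the bookkeeping we have to write ourselves.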

The collect() method performs the reduction, but it delegates the actual work to a collector. We could, for instance, gather the transformed elements into an ArrayList. Continuing with our example, we can instead join the transformed elements into a comma-separated string.
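Gathering the transformed elements into a list might look like the sketch below. The class CollectToList and the sample list are assumptions. Collectors.toList() makes no promise about the concrete List type, so the sketch uses Collectors.toCollection(ArrayList::new) to ask for an ArrayList explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class CollectToList {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // The collector decides the container; here we request an ArrayList.
    static ArrayList<String> upperCased(List<String> names) {
        return names.stream()
            .map(String::toUpperCase)
            .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        System.out.println(upperCased(FRIENDS));
    }
}
```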


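// joining() is a static method of java.util.stream.Collectors:
// import static java.util.stream.Collectors.joining;
System.out.println(
      friends.stream()
          .map(String::toUpperCase)
          .collect(joining(", ")));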

We called the collect() method on the transformed list, passing it the collector returned by joining(), a static method of the Collectors utility class. A collector acts as a receiver: it takes the objects handed to it by collect() and stores them in the form you want, such as an ArrayList or a String. We will explore this further in a later section on the collect() method and the Collectors class.

Here are the names as printed, now in uppercase and separated by commas.


BRIAN, NATE, NEAL, RAJU, SARA, SCOTT
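The filter() step mentioned earlier slots into the same pipeline. The following sketch keeps only the four-letter names before uppercasing and joining them. The class FilterMapJoin, the length criterion, and the sample list are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FilterMapJoin {
    // Assumed sample data, inferred from the output shown in this article.
    static final List<String> FRIENDS =
        Arrays.asList("Brian", "Nate", "Neal", "Raju", "Sara", "Scott");

    // filter -> map -> collect: select, transform, then reduce to one string.
    static String fourLetterNames(List<String> names) {
        return names.stream()
            .filter(name -> name.length() == 4)
            .map(String::toUpperCase)
            .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(fourLetterNames(FRIENDS)); // prints NATE, NEAL, RAJU, SARA
    }
}
```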

Conclusion

Collections are everywhere in programming, and with lambda expressions, collection operations in Java have become much easier. The clumsy old code for working with collections can be replaced by this clean and elegant new style. Internal iterators make it easier to traverse collections, to transform them without the burden of mutability, and to find elements in them. With these new methods you write less code. The result is code that is easier to maintain, more focused on business logic, and less cluttered with low-level plumbing.

In the next chapter, we'll see how lambda expressions simplify two other basic tasks in program development: string manipulation and object comparison.

