Writing data to a map type field of Hive from Java

  • 2021-11-02 01:01:05
  • OfStack

Writing data to a map type field of Hive

The field type in the table is map<string,string>.

The corresponding attribute of the Java class must be declared as String, not Map<String,String>!

Method 1:

The CREATE TABLE statement defines the map delimiters:


row format delimited
  fields terminated by '|'
  collection items terminated by ','
  map keys terminated by ':'
  NULL DEFINED AS ''

After building the map in Java, you cannot write the result of map.toString() into the field directly: it puts "=" between each key and value, so the field content is not formed correctly. You also cannot serialize the map into a JSON string and write that into the field: the content ends up full of "\" characters. Instead, define your own method to convert the map to a string:


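    import java.util.Map;
    import java.util.StringJoiner;
    // Assumption: the original does not say which StringUtils is used; Apache Commons Lang 3 is assumed here.
    import org.apache.commons.lang3.StringUtils;

    public static String insertToMap(Map<String, String> map) {
        // Join pairs with ',' and key/value with ':', matching the table definition.
        // StringJoiner returns "" for an empty map instead of throwing on substring.
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> entry : map.entrySet()) {
            // Write blank values as the literal string "NULL".
            String value = StringUtils.isBlank(entry.getValue()) ? "NULL" : entry.getValue();
            joiner.add(entry.getKey() + ":" + value);
        }
        return joiner.toString();
    }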

The resulting string contains no double quotation marks or curly braces. Once it is inserted into the field, Hive itself adds double quotation marks around each key and value, and curly braces at both ends! (Why set the value to "NULL" when it is empty? If an empty value is written as is, it may be processed into NULL with four double quotation marks, so empty values are explicitly written as the string "NULL".)
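To make the difference concrete, the sketch below compares what map.toString() produces with the delimited string that Method 1 builds. It is a minimal illustration rather than the author's code: the class name MapToStringDemo and the helper toDelimited are hypothetical, and a plain null/whitespace check stands in for StringUtils.isBlank so that the example needs no external library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class MapToStringDemo {

    // Hypothetical helper: joins entries as key:value pairs separated by commas,
    // matching the ':' and ',' delimiters declared in the table definition.
    static String toDelimited(Map<String, String> map) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> e : map.entrySet()) {
            String v = e.getValue();
            // Simplified stand-in for StringUtils.isBlank (an assumption of this sketch).
            boolean blank = (v == null || v.trim().isEmpty());
            // Blank values are written as the literal string "NULL".
            joiner.add(e.getKey() + ":" + (blank ? "NULL" : v));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> map = new LinkedHashMap<>();
        map.put("city", "Beijing");
        map.put("zip", "");

        System.out.println(map.toString());   // {city=Beijing, zip=}
        System.out.println(toDelimited(map)); // city:Beijing,zip:NULL
    }
}
```

The first line shows the "=" and the braces that make map.toString() unsuitable; the second is the form that matches the delimiters in the table definition.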

Method 2:

The CREATE TABLE statement does not define the map delimiters:

Again, after building the map in Java, you cannot write the result of map.toString() into the field directly, and you cannot serialize the map into a JSON string and write that into the field either! Define your own method to convert the map to a string:


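    import java.util.Map;
    import java.util.StringJoiner;
    // Assumption: the original does not say which StringUtils is used; Apache Commons Lang 3 is assumed here.
    import org.apache.commons.lang3.StringUtils;

    public static String insertToMap(Map<String, String> map) {
        // Join pairs with "\002" and key/value with "\003" (octal escapes for Hive's default delimiters).
        // StringJoiner returns "" for an empty map instead of throwing on substring.
        StringJoiner joiner = new StringJoiner("\002");
        for (Map.Entry<String, String> entry : map.entrySet()) {
            // Write blank values as the literal string "NULL".
            String value = StringUtils.isBlank(entry.getValue()) ? "NULL" : entry.getValue();
            joiner.add(entry.getKey() + "\003" + value);
        }
        return joiner.toString();
    }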

What you get is the correct field content!

By default, Hive separates a key from its value with "\003" and separates two key-value pairs with "\002"!
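Because these delimiters are invisible control characters, it can help to give them names and print them in a readable form while debugging. The sketch below is one possible way to do that. The class HiveDefaultDelimiters and its helpers are hypothetical names, the blank-to-"NULL" handling is left out for brevity, and the "\001" field delimiter is Hive's default separator between columns, which the text above does not mention.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class HiveDefaultDelimiters {

    // Hive's default delimiters when the table declares none (octal escapes).
    static final String FIELD = "\001";     // Ctrl-A, between columns
    static final String ITEM = "\002";      // Ctrl-B, between key-value pairs
    static final String KEY_VALUE = "\003"; // Ctrl-C, between a key and its value

    // Hypothetical helper: serializes a map using the default delimiters.
    static String toHiveMap(Map<String, String> map) {
        StringJoiner joiner = new StringJoiner(ITEM);
        for (Map.Entry<String, String> e : map.entrySet()) {
            joiner.add(e.getKey() + KEY_VALUE + e.getValue());
        }
        return joiner.toString();
    }

    // Replaces the control characters with visible markers, for printing only.
    static String visible(String s) {
        return s.replace(FIELD, "^A").replace(ITEM, "^B").replace(KEY_VALUE, "^C");
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("k1", "v1");
        tags.put("k2", "v2");

        // One row with an id column followed by the map column.
        String row = "1001" + FIELD + toHiveMap(tags);
        System.out.println(visible(row)); // 1001^Ak1^Cv1^Bk2^Cv2
    }
}
```

Only the output of visible is meant for human eyes; the string written to Hive should keep the raw control characters.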

The above is the conclusion after trying several methods today!

Defining and inserting Hive map type fields

The map type defines a key-value (kv) structure and is often used in Hive.

How do I define a map type?


create table employee(id string, perf map<string, int>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':';

Here, FIELDS TERMINATED BY sets the separator between fields, COLLECTION ITEMS TERMINATED BY sets the separator between kv pairs, and MAP KEYS TERMINATED BY sets the separator between k and v.

When importing data, you only need to format the data with the corresponding separators.
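As an illustration of formatting data with those separators, the following sketch builds one text line for the employee table defined above: a tab between id and perf, "," between kv pairs, and ":" between k and v. The class EmployeeRowDemo, the helper toRow, and the sample values are hypothetical and only serve the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class EmployeeRowDemo {

    // Hypothetical helper: formats one line for the employee table
    // (tab between fields, ',' between pairs, ':' between key and value).
    static String toRow(String id, Map<String, Integer> perf) {
        StringJoiner pairs = new StringJoiner(",");
        for (Map.Entry<String, Integer> e : perf.entrySet()) {
            pairs.add(e.getKey() + ":" + e.getValue());
        }
        return id + "\t" + pairs;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the line is predictable.
        Map<String, Integer> perf = new LinkedHashMap<>();
        perf.put("q1", 90);
        perf.put("q2", 85);

        // Prints e001, a tab character, then q1:90,q2:85
        System.out.println(toRow("e001", perf));
    }
}
```

Note that this simple approach assumes keys and values never contain the separator characters themselves; if they can, a different separator has to be chosen in the table definition.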

