Examples of reading txt, json, and hdf5 files in Python

  • 2020-11-03 22:31:04
  • OfStack

1. Reading a txt file in Python

The simplest approach is the built-in open function:


# -*- coding: utf-8 -*-
with open("test.txt","r",encoding="gbk",errors='ignore') as f:
 print(f.read())

Here, a txt file is read using the open function. The encoding argument specifies that the file is decoded as "gbk", and errors='ignore' tells Python to skip bytes that cannot be decoded instead of raising an error.

In addition, it is a good habit to use a with statement for file I/O: the file is closed automatically when the block ends, so there is no need to call close() every time you open a file.
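The points above can be tried out with a small, self-contained sketch. The file name and sample text are made up for illustration; the sketch writes a gbk-encoded file to a temporary directory, reads it back in one go and line by line, and then shows what errors="ignore" does when the wrong codec is used.

```python
import os
import tempfile

# Hypothetical file name and sample text, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
sample_text = "first line\n第二行\nthird line\n"

# Write the sample text encoded as gbk.
with open(path, "w", encoding="gbk") as f:
    f.write(sample_text)

# Read the whole file back at once, decoding it as gbk.
with open(path, "r", encoding="gbk") as f:
    whole_text = f.read()

# Read it line by line, which avoids loading a large file into memory.
with open(path, "r", encoding="gbk") as f:
    lines = [line.rstrip("\n") for line in f]

# Decode with the wrong codec: errors="ignore" silently drops the bytes
# it cannot decode, so characters can go missing without any warning.
with open(path, "r", encoding="ascii", errors="ignore") as f:
    lossy_text = f.read()

print(whole_text == sample_text)
print(lines)
print(repr(lossy_text))
```

Because errors="ignore" discards data silently, it is probably best reserved for cases where losing a few characters is acceptable.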

2. Reading a json file in Python

A simple example.json file looks like this:


{
 "glossary": {
 "title": "example glossary",
 "GlossDiv": {
  "title": "S",
  "GlossList": {
  "GlossEntry": {
   "ID": "SGML",
   "SortAs": "SGML",
   "GlossTerm": "Standard Generalized Markup Language",
   "Acronym": "SGML",
   "Abbrev": "ISO 8879:1986",
   "GlossDef": {
   "para": "A meta-markup language, used to create markup languages such as DocBook.",
   "GlossSeeAlso": ["GML", "XML"]
   },
   "GlossSee": "markup"
  }
  }
 }
 }
}

Python's json module is used to parse it:


import json
# open the file in a with block so it is closed after parsing
with open('example.json') as f:
    data = json.load(f)
print(type(data))
print(data)

The output is as follows:


<class 'dict'>
{'glossary': {'title': 'example glossary', 'GlossDiv': {'title': 'S', 'GlossList': {'GlossEntry': {'ID': 'SGML', 'SortAs': 'SGML', 'GlossTerm': 'Standard Generalized Markup Language', 'Acronym': 'SGML', 'Abbrev': 'ISO 8879:1986', 'GlossDef': {'para': 'A meta-markup language, used to create markup languages such as DocBook.', 'GlossSeeAlso': ['GML', 'XML']}, 'GlossSee': 'markup'}}}}}

The json.load() function returns a dict here because the top-level JSON value is an object. The JSON data is now a nested structure of Python dictionaries and lists.

We can then access values with standard key lookups, such as:


print(data['glossary']['GlossDiv']['GlossList'])

The print results are as follows:


{'GlossEntry': {'ID': 'SGML', 'SortAs': 'SGML', 'GlossTerm': 'Standard Generalized Markup Language', 'Acronym': 'SGML', 'Abbrev': 'ISO 8879:1986', 'GlossDef': {'para': 'A meta-markup language, used to create markup languages such as DocBook.', 'GlossSeeAlso': ['GML', 'XML']}, 'GlossSee': 'markup'}}
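As a further illustration of key lookups, here is a minimal sketch that parses a trimmed copy of the glossary from a string (so it runs without example.json on disk), reaches into the nested list, and uses dict.get for a key that may be missing. The trimmed document and the "n/a" default are assumptions made for the example.

```python
import json

# A trimmed copy of the glossary document, embedded as a string so the
# sketch runs without example.json on disk.
raw = """
{
  "glossary": {
    "title": "example glossary",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "GlossDef": {"GlossSeeAlso": ["GML", "XML"]}
        }
      }
    }
  }
}
"""

# json.loads parses a str; json.load parses an open file object.
data = json.loads(raw)

entry = data["glossary"]["GlossDiv"]["GlossList"]["GlossEntry"]

# A JSON array becomes a Python list.
see_also = entry["GlossDef"]["GlossSeeAlso"]

# dict.get returns a default instead of raising KeyError for a missing key.
acronym = entry.get("Acronym", "n/a")

print(entry["ID"])
print(see_also)
print(acronym)

# Serialize back to a JSON string; indent=2 pretty-prints it.
text = json.dumps(entry, indent=2)
print(text)
```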

3. Reading an HDF5 file in Python

HDF5 is a hierarchical data format that is often used to store complex scientific data. MATLAB, for example, uses this format to store data. It is well suited to complex hierarchical data with associated metadata, such as the results of computer simulations.

The main concepts related to HDF5 are as follows:

File: the container for hierarchical data, equivalent to the root of a tree

Group: a node of the tree

Dataset: an array of numeric data, which can be very large

Attribute: a small piece of metadata that provides additional information
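The four concepts map onto h5py objects fairly directly. The following is a minimal sketch, assuming h5py and NumPy are installed; the file, group, dataset, and attribute names are hypothetical and chosen only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file, group, dataset, and attribute names for illustration.
path = os.path.join(tempfile.mkdtemp(), "concepts.hdf5")

with h5py.File(path, "w") as f:                    # File: the root of the tree
    grp = f.create_group("simulation")             # Group: a node of the tree
    values = np.arange(12, dtype="f8").reshape(3, 4)
    dset = grp.create_dataset("temperature", data=values)  # Dataset: an array
    dset.attrs["units"] = "kelvin"                 # Attribute: metadata

with h5py.File(path, "r") as f:
    dset = f["simulation/temperature"]             # path-style lookup
    shape = dset.shape
    units = dset.attrs["units"]
    first_row = dset[0, :]                         # slicing reads only this part

print(shape)
print(units)
print(first_row)
```

Slicing a dataset reads only the requested part from disk, which is what makes very large datasets practical to work with.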


# -*- coding: utf-8 -*-
# Create an HDF5 file
import os
import h5py
import numpy as np
imgData = np.zeros((30,3,128,256))
if not os.path.exists('test.hdf5'):
    # mode 'w' creates the file; recent h5py versions default to read-only
    with h5py.File('test.hdf5', 'w') as f:
        f['data'] = imgData            # write the array under the key 'data'
        f['labels'] = np.arange(100)   # write the labels under the key 'labels'

Read after creation:


import h5py
with h5py.File('test.hdf5', 'r') as f:
    print(f)                 # the File object itself
    print(list(f.keys()))    # names of the top-level entries in the file
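Listing the keys is usually only the first step. As a hedged sketch of what typically follows, the code below builds a small file in a temporary directory (a stand-in for test.hdf5, with made-up shapes), records the shape and dtype of each dataset, and copies one dataset into a NumPy array.

```python
import os
import tempfile

import h5py
import numpy as np

# A small stand-in file with hypothetical shapes, so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "small.hdf5")
with h5py.File(path, "w") as f:
    f["data"] = np.zeros((2, 3))
    f["labels"] = np.arange(5, dtype="i8")

summary = {}
with h5py.File(path, "r") as f:
    for name in f.keys():
        dset = f[name]
        # shape and dtype are available without reading the data itself
        summary[name] = (dset.shape, str(dset.dtype))
    # [:] copies the dataset into an in-memory NumPy array,
    # which stays usable after the file is closed
    labels = f["labels"][:]

print(summary)
print(labels)
```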

In addition to the above methods, pandas provides the pd.HDFStore class, which opens h5 files directly (it relies on the PyTables package):


import pandas as pd
# pd.HDFStore opens the file; call data.close() when finished with it
data = pd.HDFStore("dataset_log.h5")
print(type(data))

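The print result is as follows. The second line comes from PyTables, which closes the still-open file when the interpreter exits: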


<class 'pandas.io.pytables.HDFStore'>
Closing remaining open files:dataset_log.h5...done
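The "Closing remaining open files" message can be avoided by closing the store explicitly. A minimal sketch, assuming pandas and PyTables are installed: HDFStore can be used as a context manager, so the file is closed when the block ends. The file name, key, and DataFrame contents below are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file name, key, and frame, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "frames.h5")
df = pd.DataFrame({"step": [1, 2, 3], "loss": [0.9, 0.5, 0.3]})

# HDFStore works as a context manager, so the file is closed on exit.
with pd.HDFStore(path, mode="w") as store:
    store.put("log", df)      # store the frame under the key "log"

with pd.HDFStore(path, mode="r") as store:
    keys = store.keys()       # keys are reported with a leading "/"
    loaded = store["log"]     # read the frame back

print(keys)
print(loaded)
```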
