How to Read .mtx Files in Python

  • 2021-11-01 03:59:28
  • OfStack

An .mtx file stores matrix data in a sparse matrix format (the Matrix Market format). It can be read with the following steps:

1. Install the scanpy package


pip install scanpy

2. File reading


import scanpy as sc
adata = sc.read(filename)  # filename: path to the .mtx file
data = adata.X  # the matrix data (sparse)

The call to sc.read returns an AnnData object, and the next line obtains the matrix data through its .X attribute.

3. Convert to dense matrix


data = data.todense()

The matrix obtained this way is sparse; the todense function converts it into a dense matrix.
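
To make the sparse-versus-dense distinction concrete, here is a minimal standard-library sketch of what a reader does with the coordinate layout of an .mtx file. It is not how scanpy is implemented; the sample text, parse_mtx, and to_dense are hypothetical names for illustration, and the sketch assumes a real-valued "general" matrix only.

```python
import io

# Hypothetical sample: a 3x4 matrix with 3 non-zero entries, written in the
# Matrix Market "coordinate" layout.
SAMPLE_MTX = """%%MatrixMarket matrix coordinate real general
% comment lines start with a percent sign
3 4 3
1 1 2.5
2 3 -1.0
3 4 7.0
"""


def parse_mtx(stream):
    """Return (n_rows, n_cols, entries); entries maps (row, col) -> value."""
    # Drop the banner, comment lines, and blank lines.
    lines = [ln for ln in stream if ln.strip() and not ln.startswith("%")]
    # The first remaining line is the size line: rows, columns, non-zeros.
    n_rows, n_cols, n_nonzero = (int(tok) for tok in lines[0].split())
    entries = {}
    for ln in lines[1:]:
        row, col, value = ln.split()
        # Indices in the file are 1-based; store them 0-based.
        entries[(int(row) - 1, int(col) - 1)] = float(value)
    if len(entries) != n_nonzero:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


def to_dense(n_rows, n_cols, entries):
    """Expand the sparse entries into a list of lists, filling in zeros."""
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for (r, c), v in entries.items():
        dense[r][c] = v
    return dense


n_rows, n_cols, entries = parse_mtx(io.StringIO(SAMPLE_MTX))
dense = to_dense(n_rows, n_cols, entries)
```

The sparse form stores only the three non-zero entries, while the dense form stores all twelve cells, which is why converting a large sparse matrix to dense can use much more memory.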

Supplement: reading other file types in Python

JSON:


import json

use_time = []
with open(address, 'r') as f:  # address: path to the JSON file
    mobile = json.load(f)
calls = mobile["transactions"][0]["calls"]
for call in calls:
    use_time.append(str(call['use_time']))
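
The snippet above depends on a file with a particular layout. As a self-contained sketch, the same extraction can be run against an in-memory string; SAMPLE_JSON is hypothetical data that only mirrors the transactions / calls / use_time structure assumed above, and the checks for missing keys are an added precaution, not part of the original.

```python
import json

# Hypothetical sample with the same nesting as the file read above.
SAMPLE_JSON = (
    '{"transactions": [{"calls": '
    '[{"use_time": 35}, {"use_time": 120}, {}]}]}'
)

mobile = json.loads(SAMPLE_JSON)

use_time = []
transactions = mobile.get("transactions", [])
calls = transactions[0].get("calls", []) if transactions else []
for call in calls:
    # Skip records that lack the field instead of raising KeyError.
    if "use_time" in call:
        use_time.append(str(call["use_time"]))
```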

Excel:


from datetime import datetime
from xlrd import open_workbook, xldate_as_tuple

# date_index and phone_index are the column positions of the date and the phone number
rawdata1 = open_workbook(address)
rawdata = rawdata1.sheet_by_index(0)
for i in range(1, rawdata.nrows):  # start at 1 to skip the header row
    date_cell = rawdata.cell(i, date_index)
    if date_cell.value == "":  # skip blank rows
        continue
    if date_cell.ctype == 3:  # cell type 3 is a date, so process it with the datetime module
        date2 = xldate_as_tuple(date_cell.value, 0)  # 0 = 1900-based date mode
        date3 = datetime(*date2)
        phone1 = str(rawdata.cell(i, phone_index).value)
        if "." in phone1:  # numeric cells are read as floats; drop the trailing ".0"
            phone1 = phone1[:-2]
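
The two conversions in the Excel snippet can be illustrated without a workbook. The sketch below uses only the standard library and assumes the 1900 date system with serial numbers on or after 1900-03-01; it approximates what xldate_as_tuple followed by datetime(*...) yields in that range and does not replace xlrd. excel_serial_to_datetime and clean_phone are hypothetical helper names.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial):
    """Convert a 1900-system Excel serial number to a datetime.

    Assumes serial >= 61 (1900-03-01), which sidesteps Excel's
    fictitious 1900-02-29.
    """
    return datetime(1899, 12, 30) + timedelta(days=serial)


def clean_phone(value):
    """Render a cell value as a phone string, dropping a float's '.0'."""
    text = str(value)
    return text[:-2] if text.endswith(".0") else text


date3 = excel_serial_to_datetime(43831)
phone1 = clean_phone(13800000000.0)
```

Checking for a trailing ".0" rather than any "." is a slightly stricter variant of the test used above; it leaves values such as "138.5" untouched.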

Write Excel:


import xlwt

# phonelist, dic and fileName are assumed to be defined earlier
Excel_file = xlwt.Workbook()
sheet = Excel_file.add_sheet('sheet0')
header = ['Number', 'Date top1', 'Date top2', 'Date top3']
# Write the header row:
for i in range(len(header)):
    sheet.write(0, i, header[i])
# Write the data row by row:
for i in range(len(phonelist)):
    sheet.write(i + 1, 0, phonelist[i])
    sheet.write(i + 1, 1, dic[str(phonelist[i])])
# Save the Excel file:
Excel_file.save("C:/Users/Desktop/100 File output xls/" + str(fileName) + ".xls")

CSV:


import pandas as pd

rawdata = pd.read_csv(address, skip_blank_lines=True)  # skip_blank_lines drops blank lines
# Use whichever of the two column names is present
if 'start_time' in rawdata.columns:
    start_time = rawdata['start_time']
elif 'begin_time' in rawdata.columns:
    start_time = rawdata['begin_time']
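
Where pandas is not available, the same column fallback can be sketched with the standard-library csv module instead. This is an alternative technique, not the pandas approach above; SAMPLE_CSV and read_start_times are hypothetical names used only for illustration.

```python
import csv
import io

# Hypothetical sample: the time column is named begin_time, and one line is blank.
SAMPLE_CSV = (
    "phone,begin_time\n"
    "13800000000,2021-11-01 03:59:28\n"
    "\n"
    "13900000000,2021-11-02 10:00:00\n"
)


def read_start_times(stream):
    """Return the start_time column, falling back to begin_time."""
    reader = csv.DictReader(stream)  # DictReader skips fully blank rows
    fieldnames = reader.fieldnames or []
    for name in ("start_time", "begin_time"):
        if name in fieldnames:
            return [row[name] for row in reader]
    raise KeyError("neither start_time nor begin_time is present")


start_time = read_start_times(io.StringIO(SAMPLE_CSV))
```

Testing each name separately matters: an expression such as 'start_time' or 'begin_time' in columns is always true, because the non-empty string 'start_time' is itself truthy.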

txt:


start_time = []
with open(address, 'r') as rawdata:
    lines = rawdata.readlines()

# Row 1 holds the column names, so inspect row 2 (the first data row)
# to find which columns hold the date and the phone number.
a = lines[1].split(',')  # comma as separator
for j in range(len(a)):
    field = a[j].strip()
    if '-' in field or '/' in field:  # a value containing '-' or '/' is taken to be the date
        date_index = j  # column index of the date
    elif field.isdigit() and len(field) > 5:  # a long all-digit string is taken to be the phone number
        phone_index = j

# Store the data, starting from row 2:
for line in lines[1:]:
    if len(line) < 10:  # skip blank lines
        continue
    data_line = line.split(',')  # txt data is comma-separated by default
    start_time.append(data_line[date_index])
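
The column-detection heuristic can be tried on in-memory lines without a file. In this sketch SAMPLE_LINES is hypothetical data and detect_columns is a hypothetical helper; the rules are the ones described above (a field containing '-' or '/' is treated as the date, and an all-digit field longer than five characters as the phone number).

```python
def detect_columns(fields):
    """Guess (date_index, phone_index) from one sample data row."""
    date_index = phone_index = None
    for j, field in enumerate(fields):
        field = field.strip()
        if "-" in field or "/" in field:
            date_index = j
        elif field.isdigit() and len(field) > 5:
            phone_index = j
    return date_index, phone_index


# Hypothetical file contents: a header row, two data rows, and a blank line.
SAMPLE_LINES = [
    "phone,start_time,duration\n",
    "13800000000,2021/11/01,35\n",
    "\n",
    "13900000000,2021/11/02,120\n",
]

# Detect the columns from the first data row, then collect the dates.
date_index, phone_index = detect_columns(SAMPLE_LINES[1].split(","))
start_time = [
    line.split(",")[date_index]
    for line in SAMPLE_LINES[1:]
    if line.strip()  # skip blank lines
]
```

Holding the lines in a list avoids a pitfall of iterating over the same open file twice: the second loop would resume where the first one stopped instead of starting again from the top.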
