Python Pandas: Using the pandas.read_sql Function, with Examples
- 2021-11-13 02:17:22
- OfStack
Pandas is a tool built on NumPy for data analysis tasks. It incorporates a number of libraries and standard data models, and provides the tools needed to manipulate large data sets efficiently. Its many functions and methods let us process data quickly and conveniently, and it is one of the main reasons Python is a powerful and efficient environment for data analysis. This article introduces the read_sql function in Pandas.
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
Read a SQL query or a database table into a DataFrame.
This function is a convenience wrapper around read_sql_table and read_sql_query (kept for backward compatibility). It delegates to the specific function depending on the input: a SQL query is routed to read_sql_query, and a database table name is routed to read_sql_table. Note that the function it delegates to may have more specific notes about its behavior that are not listed here.
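As a minimal sketch of the delegation described above, the following compares read_sql with a direct call to read_sql_query. The in-memory SQLite database and the views table are hypothetical and do not come from the original article. Only the query path is shown, because a table name is routed to read_sql_table only when con is a SQLAlchemy connectable.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory SQLite database with a small example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (name TEXT, hits INTEGER)")
conn.executemany("INSERT INTO views VALUES (?, ?)", [("home", 10), ("about", 3)])
conn.commit()

query = "SELECT name, hits FROM views ORDER BY hits DESC"

# read_sql receives a SQL query, so it delegates to read_sql_query.
df_generic = pd.read_sql(query, conn)
# Calling the specific function directly should give the same result.
df_specific = pd.read_sql_query(query, conn)

print(df_generic.equals(df_specific))
conn.close()
```

If both calls return the same frame, read_sql can be treated as a shorthand in code that only ever passes queries.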
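Parameters:
sql: string or SQLAlchemy Selectable (select or text object). The SQL query to execute, or a table name.
con: SQLAlchemy connectable (engine/connection), database string URI, or DBAPI2 connection (fallback mode). With SQLAlchemy, any database supported by that library can be used. If a DBAPI2 object is given, only sqlite3 is supported.
index_col: string or list of strings, optional, default None. The column(s) to set as the index (MultiIndex).
coerce_float: boolean, default True. Attempts to convert values of non-string, non-numeric objects (such as decimal.Decimal) to floating point, which is useful for SQL result sets.
params: list, tuple or dict, optional, default None. The list of parameters to pass to the execute method. The syntax used to pass parameters depends on the database driver. Check your driver's documentation to see which of the five syntax styles described in PEP 249's paramstyle it supports. For example, psycopg2 uses %(name)s, so use params={'name': 'value'}.
parse_dates: list or dict, default None. One of the following:
- A list of column names to parse as dates.
- A dict of {column_name: format string}, where the format string is strftime-compatible when parsing string times, or one of (D, s, ns, ms, us) when parsing integer timestamps.
- A dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). This is especially useful with databases that lack native datetime support, such as SQLite.
columns: list, default None. The list of column names to select from the SQL table (only used when reading a table).
chunksize: int, default None. If specified, an iterator is returned, where chunksize is the number of rows to include in each chunk.
Returns:
DataFrame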
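To make the parameter descriptions above more concrete, here is a small sketch that uses params, index_col, parse_dates and chunksize together. It assumes an in-memory SQLite database and a hypothetical readings table. The ? placeholder is the style sqlite3 accepts; other drivers may require a different one.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table; SQLite stores the timestamps as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, taken_at TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2014-06-24 16:00:00", 1.5),
        (2, "2014-06-24 16:30:00", 2.5),
        (3, "2014-06-24 17:00:00", 3.5),
    ],
)
conn.commit()

# params fills the ? placeholder, index_col sets the index,
# and parse_dates converts the text column to a datetime dtype.
df = pd.read_sql(
    "SELECT id, taken_at, value FROM readings WHERE value > ?",
    conn,
    params=(2.0,),
    index_col="id",
    parse_dates=["taken_at"],
)

# With chunksize, read_sql returns an iterator of DataFrames
# instead of a single DataFrame; consume it before closing the connection.
chunks = list(pd.read_sql("SELECT * FROM readings", conn, chunksize=2))
chunk_lengths = [len(chunk) for chunk in chunks]

print(df)
print(chunk_lengths)
conn.close()
```

Reading in chunks keeps memory use bounded for large result sets, at the cost of having to process or concatenate the pieces yourself.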
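For example:
import MySQLdb
import pandas as pd

# Connect to MySQL; the host and credentials are placeholders.
# Note: recent pandas versions may warn when given a raw DBAPI2
# connection other than sqlite3; a SQLAlchemy engine avoids this.
mysql_cn = MySQLdb.connect(host='myhost',
                           port=3306, user='myusername', passwd='mypassword',
                           db='information_schema')
df_mysql = pd.read_sql('select * from VIEWS;', con=mysql_cn)
print('loaded dataframe from MySQL. records:', len(df_mysql))
mysql_cn.close()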
Extended content:
Is there an example of how to pass parameters to a SQL query in Pandas?
In particular, I am using a SQLAlchemy engine to connect to a PostgreSQL database. So far, I have found that the following works:
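# Assumed context: from datetime import datetime; psql is pandas.io.sql; db is the SQLAlchemy engine.
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                    'where "Timestamp" BETWEEN %s AND %s'),
                   db, params=[datetime(2014, 6, 24, 16, 0), datetime(2014, 6, 24, 17, 0)],
                   index_col=['Timestamp'])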
The pandas documentation says that params can also be passed as a dict, but I can't seem to get this to work. This is what I tried:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                    'where "Timestamp" BETWEEN :dstart AND :dfinish'),
                   db, params={"dstart": datetime(2014, 6, 24, 16, 0), "dfinish": datetime(2014, 6, 24, 17, 0)},
                   index_col=['Timestamp'])
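One likely reason the dict-based attempt fails is the placeholder style: as the params description notes, the syntax depends on the driver, and psycopg2 expects %(name)s rather than :name. For PostgreSQL via psycopg2 the placeholders would therefore presumably be written as %(dstart)s and %(dfinish)s. The sketch below is not the original poster's setup. It uses an in-memory SQLite database with hypothetical data, because sqlite3 does accept the :name style, to show dict parameters working when the placeholder style matches the driver.

```python
import sqlite3
from datetime import datetime

import pandas as pd

# Hypothetical in-memory stand-in for "MyTable"; timestamps are stored as text.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "MyTable" ("Timestamp" TEXT, "Value" REAL)')
conn.executemany(
    'INSERT INTO "MyTable" VALUES (?, ?)',
    [
        ("2014-06-24 15:30:00", 1.0),
        ("2014-06-24 16:30:00", 2.0),
        ("2014-06-24 17:30:00", 3.0),
    ],
)
conn.commit()

# Pass the bounds as ISO-formatted strings so they compare correctly
# against the text column.
params = {
    "dstart": datetime(2014, 6, 24, 16, 0).isoformat(" "),
    "dfinish": datetime(2014, 6, 24, 17, 0).isoformat(" "),
}

# sqlite3 accepts :name placeholders with a dict;
# psycopg2 would need %(dstart)s and %(dfinish)s instead.
df = pd.read_sql(
    'select "Timestamp","Value" from "MyTable" '
    'where "Timestamp" BETWEEN :dstart AND :dfinish',
    conn,
    params=params,
    index_col=["Timestamp"],
    parse_dates=["Timestamp"],
)

print(df)
conn.close()
```

The general rule is to check the paramstyle documented by the driver behind the connection and write the placeholders in that style, whether params is a list or a dict.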