Tag Archives: python

Python program exit: os._exit() and sys.exit()

Overview

Python has two ways to exit a program: os._exit() and sys.exit(). I looked up the difference between the two.

os._exit() terminates the Python process immediately: none of the code after it runs, and no cleanup (finally blocks, atexit handlers, flushing of buffered output) takes place.

sys.exit() raises a SystemExit exception. If the exception is not caught, the Python interpreter exits; if some code catches it, that code still runs.


For example:

import os

try:
    os._exit(0)
except:
    print('Program is dead.')

Nothing is printed here: os._exit() raises no exception, so there is nothing to catch, and the process ends on the spot.

import sys

try:
    sys.exit(0)
except:
    print('Program is dead.')
finally:
    print('clean-up')

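Both messages are printed here, because sys.exit() raises SystemExit and the bare except catches it. (An except Exception clause would not catch it, since SystemExit derives from BaseException.)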


Conclusion

sys.exit() exits the program gracefully: it raises a SystemExit exception, which we can catch in order to do some cleanup. os._exit() simply terminates the Python interpreter, and none of the following statements are executed.

In general, use sys.exit(); os._exit() can be used in the child process created by os.fork().
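
The difference can be checked with a short script. This is only a minimal sketch: the helper name run_snippet and the exit status 3 are made up for illustration, and each snippet runs in a separate interpreter so that os._exit() does not kill the demonstration itself.

```python
import subprocess
import sys


def run_snippet(code):
    """Run code in a fresh interpreter; return (exit status, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout to str
    )
    return result.returncode, result.stdout


# sys.exit() raises SystemExit, so the finally block runs first.
SYS_EXIT = """
import sys
try:
    sys.exit(3)
finally:
    print('clean-up')
"""

# os._exit() ends the process at once; the finally block is skipped.
OS_EXIT = """
import os
try:
    os._exit(3)
finally:
    print('clean-up')
"""

sys_status, sys_out = run_snippet(SYS_EXIT)
os_status, os_out = run_snippet(OS_EXIT)

print(sys_status, repr(sys_out))
print(os_status, repr(os_out))
```

Both children end with the same exit status; only the one that used sys.exit() gets to print its clean-up message.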

References:

[1] https://docs.python.org/3.5/library/exceptions.html

[2] http://www.cnblogs.com/gaott/archive/2013/04/12/3016355.html

Python SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3:

Today I installed Anaconda for digital image processing in Python and used its default editor, Spyder. Running a simple program like the one shown below produced this error:

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

The simple test code is:

 

# -*- coding: utf-8 -*-
"""
Created on Tue Oct 24 21:31:25 2017

@author: harchi
"""
from skimage import io
img=io.imread('C:\Users\harchi\Desktop\图像处理\skeleton.bmp')
io.imshow(img)

The cause of the error is that the "\" in imread('C:\Users\harchi\Desktop\图像处理\skeleton.bmp') starts an escape sequence in Python. Here "\U" is read as the start of a \UXXXXXXXX Unicode escape, and the characters that follow do not form a valid one.

The solution, of course, is to stop "\" from being treated as an escape. You can:

1. Prefix the string with r or R, i.e. imread(r'C:\Users\harchi\Desktop\图像处理\skeleton.bmp'), where r or R marks a raw (unescaped) string in Python.

2. Put a "\" before each "\" to escape it, namely: imread('C:\\Users\\harchi\\Desktop\\图像处理\\skeleton.bmp')

3. Change "\" into "/", namely: imread('C:/Users/harchi/Desktop/图像处理/skeleton.bmp')

Finally, some background on Python string prefixes:

1. Prefixing a string with r or R marks it as a raw string, in which backslashes are not treated as escapes.

2. Prefixing a string with u or U marks it as a Unicode string. (In Python 3 every str is already Unicode, so the prefix has no effect.)
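
The three spellings can be compared directly. The sketch below only builds the path strings from the post and compares them; it does not open any file, and the variable names are made up for illustration.

```python
# Raw string: backslashes are kept literally.
raw = r'C:\Users\harchi\Desktop\图像处理\skeleton.bmp'

# Ordinary string with every backslash escaped.
escaped = 'C:\\Users\\harchi\\Desktop\\图像处理\\skeleton.bmp'

# Forward slashes need no escaping at all.
forward = 'C:/Users/harchi/Desktop/图像处理/skeleton.bmp'

# The raw and the escaped spelling produce the same string.
print(raw == escaped)

# The forward-slash form differs only in the separator character.
print(forward.replace('/', '\\') == raw)
```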

In Python, print() without the line break

By default, Python's print() ends its output with a newline.
For example:

print('abc')
print('xyz')

gives us

abc
xyz

What if I want abcxyz, that is, the output of two print calls on a single line?
Use this code:

print('abc',end='')
print('xyz')

The printed result then has no line break:
abcxyz

Earlier I saw someone on the Internet write this:

print('abc'),
print('xyz')

The first statement ends with a trailing comma. This no longer has that effect in Python 3, but it works in Python 2. There is a space in between, though: in Python 2 the printed result is abc xyz
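
The end parameter has a companion, sep, which controls what goes between several arguments of one call. A small sketch, assuming Python 3; the output is collected in an in-memory buffer (the name buffer is made up here) only so that the exact characters can be inspected:

```python
import io

buffer = io.StringIO()

# end='' suppresses the newline after the first call.
print('abc', end='', file=buffer)
print('xyz', file=buffer)

# One call with two arguments; sep='' removes the default space between them.
print('abc', 'xyz', sep='', file=buffer)

result = buffer.getvalue()
print(repr(result))
```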

[Python] pandas: parameters of DataFrame.to_excel and examples of writing to an Excel file

To write to Excel, use pd.DataFrame.to_excel(), which writes a DataFrame to an Excel sheet.

to_excel(self, excel_writer, sheet_name='Sheet1', na_rep='', float_format=None,columns=None, 
header=True, index=True, index_label=None,startrow=0, startcol=0, engine=None, 
merge_cells=True, encoding=None,inf_rep='inf', verbose=True, freeze_panes=None)

Common parameters explained

  • excel_writer: target file path or ExcelWriter object
In [16]: df = pd.read_csv('test.csv')

In [17]: df
Out[17]:
   index  a_name  b_name
0      0       1       3
1      1       2       3
2      2       3       4
# excel_writer: 'excel_output.xls' is the output path
In [18]: df.to_excel('excel_output.xls')

  • sheet_name: name of the Excel sheet
# The resulting sheet is named 'biubiu'
In [20]: df.to_excel('excel_output.xls',sheet_name='biubiu')
  • na_rep: representation used for missing values; can be set to a string
In [25]: df = pd.read_excel('excel_output.xls')

In [26]: df
Out[26]:
   index  a_name  b_name
0      0       1     3.0
1      1       2     3.0
2      2       3     NaN
# If na_rep is a bool, it is written to Excel as 0 or 1; a string or number can be used as well
In [27]: df.to_excel('excel_output.xls',na_rep=True)

In [28]: pd.read_excel('excel_output.xls')
Out[28]:
   index  a_name  b_name
0      0       1       3
1      1       2       3
2      2       3       1

In [29]: df.to_excel('excel_output.xls',na_rep=False)

In [30]: pd.read_excel('excel_output.xls')
Out[30]:
   index  a_name  b_name
0      0       1       3
1      1       2       3
2      2       3       0

In [31]: df.to_excel('excel_output.xls',na_rep=11)

In [32]: pd.read_excel('excel_output.xls')
Out[32]:
   index  a_name  b_name
0      0       1       3
1      1       2       3
2      2       3      11
  • columns: select the output columns to be stored.
In [44]: df.to_excel('excel_output.xls',na_rep=11,columns=['index'])

In [45]: pd.read_excel('excel_output.xls')
Out[45]:
   index
0      0
1      1
2      2
  • header: whether to write the column names; defaults to True. A list of strings is written as aliases for the column names. With header=None, as below, no column-name row is written, so reading the file back takes the first data row as the column names;
In [48]: df.to_excel('excel_output.xls',na_rep=11,index=False)

In [49]: pd.read_excel('excel_output.xls')
Out[49]:
   index  a_name  b_name
0      0       1       3
1      1       2       3
2      2       3      11

In [50]: df.to_excel('excel_output.xls',na_rep=11,index=False,header=None)

In [51]: pd.read_excel('excel_output.xls')
Out[51]:
   0  1   3
0  1  2   3
1  2  3  11
  • index: defaults to True, which writes the row index. With index=False, the row index (names) is not written
  • index_label: column label for the index column

Python — a solution to [error 24: too many open files] under Ubuntu


In my last blog post, I mentioned that when I used a multithreaded crawler with coroutines to fetch data, a very large (number of coroutines × number of threads) produced [Error 24: too many open files] and a series of other errors. This post describes a solution to that problem.


Why is this error reported? I only have a few files open!
That is probably most people's first reaction. I was confused too when I first hit this error. Then I thought about it: each coroutine probably obtains its own file handle (network sockets count against the same limit), so although the program only writes its data to one file, at run time many threads or processes may hold file handles, and to the system the program looks as if it has opened a lot of files.
So the only way to deal with this situation is to raise the system's default maximum number of open files.


Under Ubuntu, you can enter ulimit -n in the terminal to see the current default maximum number of open files, which is usually 1024. Obviously, this value is small for a crawler.
There are two ways to change this default. Both have worked for me on different machines. The first is simple but not guaranteed to succeed. If it fails, try the second, which will almost always solve the problem.


The first method:
Enter directly in the terminal: ulimit -n 10000
The change applies only to the current shell session and the programs started from it. This did not work on my own computer, only on the server.


The second method:
In the terminal, run sudo vim /etc/security/limits.conf to open the file, go to the end of the file, press i to enter insert mode, and add:

* soft nofile 10000
* hard nofile 10000

After writing these two lines, press Esc to leave insert mode, then press Shift and ; together (which types :), type wq! and press Enter to save the file and exit. Then log out and log in again. Running ulimit -n in the terminal will now show the value you just set.
I have not tried this on Windows, which I rarely use these days, so I do not know whether the same problem occurs there. If it does, the principle should be the same, and searching along these lines should lead to a solution.
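
The limit can also be inspected, and the soft limit raised, from inside the Python process itself. This is a hedged sketch rather than part of the original fix: it uses the standard resource module, which exists only on Unix-like systems, and the target of 10000 is simply the number used above. A process without root rights can raise its soft limit only up to the hard limit.

```python
import resource

# Current limits on open file descriptors for this process (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('soft limit:', soft, 'hard limit:', hard)

wanted = 10000  # value taken from the article, not a recommendation

# Never ask for more than the hard limit allows.
if hard == resource.RLIM_INFINITY:
    new_soft = wanted
else:
    new_soft = min(wanted, hard)

# Only raise the limit; leave it alone if it is already high enough.
if soft != resource.RLIM_INFINITY and new_soft > soft:
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError) as exc:
        print('could not raise the limit:', exc)

current_soft, current_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('soft limit now:', current_soft)
```

Unlike the edit to limits.conf, this affects only the running process and its children.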


That's all. Comments are welcome.

AttributeError: 'DataFrame' object has no attribute 'ix' error

Recently, using the ix method of a DataFrame raised this error:

"AttributeError: 'DataFrame' object has no attribute 'ix'"

After searching on the Internet, I found that Series.ix and DataFrame.ix were removed starting with pandas 1.0.0.

My solution: use the DataFrame's loc (label-based) or iloc (position-based) indexer instead.

See the pandas documentation for details.
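
As a rough illustration of the replacement, the sketch below uses a small made-up DataFrame (the column names and index labels are assumptions, not data from the post) and shows the loc and iloc equivalents of typical ix lookups.

```python
import pandas as pd

# Hypothetical sample data with string row labels.
df = pd.DataFrame(
    {'a_name': [1, 2, 3], 'b_name': [3, 3, 4]},
    index=['x', 'y', 'z'],
)

# Label-based lookup; replaces df.ix['y', 'b_name'].
by_label = df.loc['y', 'b_name']

# Position-based lookup; replaces df.ix[1, 1].
by_position = df.iloc[1, 1]

# Label slices with loc include the end label.
rows = df.loc['x':'y', ['a_name']]

print(by_label, by_position)
print(rows)
```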


reference: https://hacpai.com/article/1581255121678

Time, strftime and strptime in Python

The most commonly used function, time.time(), returns the current time as a floating-point number of seconds. But strftime works on time.struct_time, which is in effect a tuple. Both strptime and localtime return this type.

>>> import time
>>> t = time.time()
>>> t
1202872416.4920001
>>> type(t)
<type 'float'>
>>> t = time.localtime()
>>> t
(2008, 2, 13, 10, 56, 44, 2, 44, 0)
>>> type(t)
<type 'time.struct_time'>
>>> time.strftime('%Y-%m-%d', t)
'2008-02-13'
>>> time.strptime('2008-02-14', '%Y-%m-%d')
(2008, 2, 14, 0, 0, 0, 3, 45, -1)

1. strftime usage

strftime formats a time as a string; combined with the current time it is pretty handy. Note, however, that the time obtained is the server's time, so watch out for time zone issues. GAE, for example, reports time in GMT (time zone 0), which you need to convert yourself.

strftime()

We can use the strftime() function to format a time in the desired format:

#!/usr/bin/python
import time

t = (2009, 2, 17, 17, 3, 38, 1, 48, 0)
t = time.mktime(t)
print(time.strftime("%b %d %Y %H:%M:%S", time.gmtime(t)))

# Output (with the local time zone at UTC+8): Feb 17 2009 09:03:38

2. strptime

Python's time.strptime() function parses a time string into a time tuple according to the specified format.
Python date and time formatting directives:

  • %y two-digit year (00-99)
  • %Y four-digit year (0000-9999)
  • %m month (01-12)
  • %d day of the month (01-31)
  • %H hour, 24-hour clock (00-23)
  • %I hour, 12-hour clock (01-12)
  • %M minute (00-59)
  • %S second (00-59)
  • %a locale's abbreviated weekday name
  • %A locale's full weekday name
  • %b locale's abbreviated month name
  • %B locale's full month name
  • %c locale's date and time representation
  • %j day of the year (001-366)
  • %p locale's equivalent of A.M. or P.M.
  • %U week number of the year (00-53), with Sunday as the first day of the week
  • %w weekday (0-6), with Sunday as 0
  • %W week number of the year (00-53), with Monday as the first day of the week
  • %x locale's date representation
  • %X locale's time representation
  • %Z current time zone name
  • %% a literal % character

Example:

#!/usr/bin/python
import time

struct_time = time.strptime("30 Nov 00", "%d %b %y")
# struct_time is a tuple subclass, so convert it before printing it as one value
print("returned tuple:", tuple(struct_time))

# Output: returned tuple: (2000, 11, 30, 0, 0, 0, 3, 335, -1)
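
The two functions are inverses of each other, which a round trip makes visible. A minimal sketch for Python 3; the date string and the two format strings are chosen only for illustration.

```python
import time

# Parse a string into a struct_time ...
parsed = time.strptime('2008-02-14 10:56:44', '%Y-%m-%d %H:%M:%S')

# ... whose fields can be read by name ...
print(parsed.tm_year, parsed.tm_wday, parsed.tm_yday)

# ... or by position, because struct_time behaves like a 9-item tuple.
year, month, day = parsed[:3]

# Format the same struct_time back into a string with a different layout.
formatted = time.strftime('%Y/%m/%d %H:%M', parsed)
print(formatted)
```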

Converting between DataFrame and np.array

I searched the Internet for a long time without finding how to convert a DataFrame into an array and an array back into a DataFrame, so here is a summary of the Python code for both conversions:

  • DataFrame into array
    df=df.values
  • array into DataFrame
    import pandas as pd

    df = pd.DataFrame(df)

And that's it!
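
One detail worth seeing in a round trip is that the array carries no column names, so they have to be passed back in. The sketch below uses made-up sample data and assumes pandas 0.24 or later, where to_numpy() is available as an alternative to .values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'a_name': [1, 2, 3], 'b_name': [3, 3, 4]})

# DataFrame -> array; holds the same data as df.values.
arr = df.to_numpy()
print(type(arr).__name__, arr.shape)

# array -> DataFrame; the column names must be supplied again.
restored = pd.DataFrame(arr, columns=df.columns)
print(restored.equals(df))
```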

Several calculation methods of Python execution time

Let me start with a pitfall I ran into in production. I was scheduling Python scripts and monitoring their processes, and the monitor reported a much longer execution time than the scripts themselves did.
The monitor measured a script's execution time as 36 hours, while the script's own statistics put it at about 4 hours.
My first thought was that something was wrong with Linux, but none of the logs I checked showed anything abnormal.
Then I suspected that py2neo, which the script uses to write data asynchronously, was blocking the process. Finally the problem was identified: the script used time.clock(), which measures CPU time, not elapsed program time. Below is a comparison of several ways to measure time in Python:

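Method 1:

import datetime
starttime = datetime.datetime.now()
#long running
#do something other
endtime = datetime.datetime.now()
print((endtime - starttime).seconds)

datetime.datetime.now() returns the current date and time, so the difference is the program's elapsed execution time. Note that .seconds is only the seconds part of the difference and leaves out whole days; total_seconds() gives the full duration.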

Method 2:

import time
start = time.time()
#long running
#do something other
end = time.time()
print(end - start)

time.time() returns the current time in seconds since the epoch, as a floating-point number; fractions of a second are included if the system clock provides them. This also measures the program's elapsed execution time.

Method 3:

import time
start = time.clock()  # removed in Python 3.8; time.process_time() is the replacement for CPU time
#long running
#do something other
end = time.clock()
print(end - start)

time.clock() returns the CPU time since the program started or since clock() was first called, with as much precision as the system records. It also returns a floating-point number. What you get here is CPU execution time. Note: program execution time = CPU time + I/O time + sleep or wait time
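
On current Python 3 versions the same contrast can be reproduced with time.perf_counter() for elapsed time and time.process_time() for CPU time. A minimal sketch, in which a short sleep stands in for the I/O and waiting that the CPU timer does not see:

```python
import time

wall_start = time.perf_counter()   # elapsed (wall-clock) timer
cpu_start = time.process_time()    # CPU time of this process only

time.sleep(0.2)                    # waiting: uses almost no CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print('elapsed: %.3f s, cpu: %.3f s' % (wall_elapsed, cpu_elapsed))
```

The CPU figure stays far below the elapsed figure, which is the same kind of gap as the one between 4 hours and 36 hours described above.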

Getting shell command output in Python

Ways to get the output of a shell command in Python:

1.

import subprocess

# Pass the command as a list without shell=True; with shell=True the '-l' would not reach ls
output = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE).communicate()
print(output[0].decode())

2.

import subprocess

# Python 2 offered the same function in the commands module, which Python 3 removed
return_code, output = subprocess.getstatusoutput('ls -l')

3.

import os

process = os.popen('ls -l') # return file
output = process.read()
process.close()
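
In Python 3.5 and later, subprocess.run() covers the same ground in a single call and also returns the exit status. This is a sketch, not one of the original three methods; it runs a child Python interpreter instead of ls so that it does not depend on the platform.

```python
import subprocess
import sys

completed = subprocess.run(
    [sys.executable, '-c', 'print("hello from child")'],
    stdout=subprocess.PIPE,    # capture standard output
    stderr=subprocess.PIPE,    # capture standard error separately
    universal_newlines=True,   # decode bytes to str
)

print(completed.returncode)
print(completed.stdout)
```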

Original article: http://www.cnblogs.com/snow-backup/p/5035792.html

How to import a Python module from a relative path

For example, we have a package with the following structure:

pkg/
  __init__.py
  libs/
    some_lib.py
    __init__.py
  components/
    code.py
    __init__.py

If code.py needs the libs/some_lib.py module and uses a relative import such as from ..libs.some_lib import something, simply adding __init__.py files to the packages is not enough. When code.py is run directly as a script, Python raises ValueError: Attempted relative import in non-package (in Python 3: ImportError: attempted relative import with no known parent package). So how do you solve this problem?

There are the following solutions:

Add the current path to sys.path

Since components and libs are folders at the same level, we can add the following code directly to code.py to put the parent folder of the current folder on the module search path.

import sys
from os import path
sys.path.append( path.dirname( path.dirname( path.abspath(__file__) ) ) )

Or the following (this works for a folder in any location, as long as lib_path holds the absolute path of that folder; note that '..' is resolved against the current working directory):

import os, sys
lib_path = os.path.abspath(os.path.join('..'))
sys.path.append(lib_path)

Then we can import with from libs.some_lib import something.

Run the code in package mode:

python -m pkg.components.code

Then we can use from ..libs.some_lib import something to import.

Note that the module name is given without the .py suffix.

Summary

We can actually combine these two approaches:

if __name__ == '__main__':
    if __package__ is None:
        import sys
        from os import path
        sys.path.append( <path to the package> )
        from libs.some_lib import something
    else:
        from ..libs.some_lib import something
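
The difference between running code.py as a script and running it with -m can be reproduced in a throwaway directory. The sketch below is an illustration under assumptions: it recreates the package layout from this post in a temporary folder, the file contents and the helper name run are made up, and it assumes Python 3.

```python
import os
import subprocess
import sys
import tempfile

# Package layout from the post; the file contents are invented for the demo.
FILES = {
    'pkg/__init__.py': '',
    'pkg/libs/__init__.py': '',
    'pkg/libs/some_lib.py': 'something = "imported"\n',
    'pkg/components/__init__.py': '',
    'pkg/components/code.py': (
        'from ..libs.some_lib import something\n'
        'print(something)\n'
    ),
}


def run(args, cwd):
    """Run the interpreter with args in cwd; return (status, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable] + args,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


with tempfile.TemporaryDirectory() as root:
    for rel_path, content in FILES.items():
        full_path = os.path.join(root, *rel_path.split('/'))
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, 'w') as handle:
            handle.write(content)

    # Run as a plain script: the relative import has no parent package.
    as_script = run([os.path.join('pkg', 'components', 'code.py')], cwd=root)

    # Run as a module from the folder that contains pkg.
    as_module = run(['-m', 'pkg.components.code'], cwd=root)

print('script status:', as_script[0])
print('module output:', as_module[1].strip())
```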