When manipulating strings, if the code you are writing starts to feel complicated, look at the string module, which provides a number of useful constants and helpers.
>>> import string
>>> dir(string)
['Formatter', 'Template', '_ChainMap', '_TemplateMetaclass', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_re', '_string', 'ascii_letters', 'ascii_lowercase', 'ascii_uppercase', 'capwords', 'digits', 'hexdigits', 'octdigits', 'printable', 'punctuation', 'whitespace']
>>> string.ascii_lowercase #All lowercase letters
'abcdefghijklmnopqrstuvwxyz'
>>> string.ascii_uppercase #All uppercase letters
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.hexdigits #All hexadecimal characters
'0123456789abcdefABCDEF'
>>> string.whitespace #All whitespace characters
' \t\n\r\x0b\x0c'
>>> string.punctuation #All punctuation characters
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
The problem
Count the number of occurrences of every word in a file or a string. Sentences contain punctuation, so splitting a string on whitespace alone leaves the punctuation attached to the words: "time," keeps its trailing comma, for example.
Splitting on each punctuation mark explicitly is possible, but it becomes tedious when the sentence is long or contains many different marks.
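To see the problem concretely, here is a minimal sketch. The sentence in `sample` is a made-up example for illustration only, not the one used later in this post.

```python
# Hypothetical sample sentence, used only for illustration.
sample = "Time flies, they say. Time flies!"

# Splitting on whitespace alone leaves punctuation glued to the words.
naive_words = sample.split()
print(naive_words)
# ['Time', 'flies,', 'they', 'say.', 'Time', 'flies!']

# 'flies,' and 'flies!' are different strings from 'flies',
# so a word count based on this list misses the word entirely.
print(naive_words.count("flies"))  # 0
```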
The solution
Idea: first replace every punctuation mark in the sentence with a space, then split the result with split(). string.punctuation supplies the set of punctuation characters to replace.
The code:
>>> import string  # Import the string module before using it.
>>> s = "We met at the wrong time, but separated at the right time. The most urgent is to take the most beautiful scenery!!! the deepest wound was the most real emotions."
>>> for i in s:
...     if i in string.punctuation:  # If the character is punctuation, replace it with a space.
...         s = s.replace(i, " ")
...
>>> s
'We met at the wrong time  but separated at the right time  The most urgent is to take the most beautiful scenery    the deepest wound was the most real emotions '
>>> s.split()  # Split on runs of whitespace
['We', 'met', 'at', 'the', 'wrong', 'time', 'but', 'separated', 'at', 'the', 'right', 'time', 'The', 'most', 'urgent', 'is', 'to', 'take', 'the', 'most', 'beautiful', 'scenery', 'the', 'deepest', 'wound', 'was', 'the', 'most', 'real', 'emotions']
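The same idea can also be written without an explicit loop. The following is a sketch of an alternative, not the approach used above: it builds a translation table with str.maketrans() that maps every character in string.punctuation to a space, then applies it in one pass with str.translate(). Note that string.punctuation only contains ASCII punctuation, so marks such as curly quotes would be left in place.

```python
import string

s = ("We met at the wrong time, but separated at the right time. "
     "The most urgent is to take the most beautiful scenery!!! "
     "the deepest wound was the most real emotions.")

# Map each ASCII punctuation character to a space.
# str.maketrans(x, y) requires x and y to have the same length.
table = str.maketrans(string.punctuation, " " * len(string.punctuation))

# Replace the punctuation in one pass, then split on whitespace.
words = s.translate(table).split()
print(words[:6])  # ['We', 'met', 'at', 'the', 'wrong', 'time']
```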
Of course, this problem can also be solved with a regular expression:
>>> import re
>>> s = "We met at the wrong time, but separated at the right time. The most urgent is to take the most beautiful scenery!!! the deepest wound was the most real emotions."
>>> re.findall(r'\b\w+\b', s)
['We', 'met', 'at', 'the', 'wrong', 'time', 'but', 'separated', 'at', 'the', 'right', 'time', 'The', 'most', 'urgent', 'is', 'to', 'take', 'the', 'most', 'beautiful', 'scenery', 'the', 'deepest', 'wound', 'was', 'the', 'most', 'real', 'emotions']
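Both approaches stop at the list of words, while the problem asks for the number of occurrences. One possible way to finish the job is collections.Counter from the standard library. This is a sketch under one assumption that the original does not state: the text is lowercased first so that 'The' and 'the' are counted as the same word. Drop the .lower() call if you want case-sensitive counts.

```python
import re
from collections import Counter

s = ("We met at the wrong time, but separated at the right time. "
     "The most urgent is to take the most beautiful scenery!!! "
     "the deepest wound was the most real emotions.")

# Assumption: counting is case-insensitive, so lowercase before matching.
words = re.findall(r"\b\w+\b", s.lower())

# Counter maps each word to the number of times it appears.
counts = Counter(words)
print(counts["the"])           # 6
print(counts.most_common(2))   # [('the', 6), ('most', 3)]
```

Keep in mind that \w also matches digits and underscores, and that a word containing an apostrophe would be split into two matches, so adjust the pattern if your text needs different rules.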
There are many ways to solve a problem, and trying several of them is good exercise. When string manipulation starts to feel cumbersome, remember the string module and check whether it offers an easier solution.