
Retrieving A Date From A Complex String In Python

I'm trying to get a single datetime out of two strings using datetime.strptime. The time is pretty easy (ex. 8:53PM), so I can do something like: theTime = datetime.strptime(give

Solution 1:

If you simply want to skip the hour in the URL, you can use split, for example:

from datetime import datetime

givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
pattern = "http://site.com/?year=%Y&month=%m&day=%d"
theDate = datetime.strptime(givenURL.split('&hour=')[0], pattern)

I'm not sure I understood you correctly, but if you also want to include the given time:

givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
datePattern = "http://site.com/?year=%Y&month=%m&day=%d"
timePattern = "&time=%I:%M%p"
theDateTime = datetime.strptime(givenURL.split('&hour=')[0] + '&time=' + givenTime, datePattern + timePattern)
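
For readers on Python 3, a related approach can be sketched with urllib.parse, so that the site address does not have to be embedded in the format string. This is only an illustration and not part of the original answer: the helper name datetime_from_url is hypothetical, and it assumes the URL always carries year, month and day query parameters.

```python
from datetime import datetime
from urllib.parse import urlparse, parse_qs


def datetime_from_url(url, time_text):
    """Combine the date in the URL's query string with a 12-hour time string.

    Hypothetical helper: assumes 'year', 'month' and 'day' are present.
    """
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; take the first of each.
    date_text = "{}-{}-{}".format(
        query["year"][0], query["month"][0], query["day"][0])
    # The 'hour' parameter is ignored; the time comes from time_text.
    return datetime.strptime(date_text + " " + time_text, "%Y-%m-%d %I:%M%p")


givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '8:53PM'
print(datetime_from_url(givenURL, givenTime))
```

Reading the parameters by name means the result does not depend on their order in the URL, which the split-based version above does.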

Solution 2:

import datetime
import re

givenURL  = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '08:53PM'

print ' givenURL == ' + givenURL
print 'givenTime == ' + givenTime

regx = re.compile('year=(\d\d\d\d)&month=(\d\d?)&day=(\d\d?)&hour=\d\d?')
print '\nmap(int,regx.search(givenURL).groups()) ==', map(int,regx.search(givenURL).groups())

theDate = datetime.date(*map(int,regx.search(givenURL).groups()))
theTime = datetime.datetime.strptime(givenTime, "%I:%M%p")

print '\ntheDate ==', theDate, type(theDate)
print '\ntheTime ==', theTime, type(theTime)

# replace() takes year, month, day positionally and keeps the parsed time
theDateTime = theTime.replace(theDate.year, theDate.month, theDate.day)
print '\ntheDateTime ==', theDateTime, type(theDateTime)

result

 givenURL == http://site.com/?year=2011&month=10&day=5&hour=11
givenTime == 08:53PM

map(int,regx.search(givenURL).groups()) == [2011, 10, 5]

theDate == 2011-10-05 <type 'datetime.date'>

theTime == 1900-01-01 20:53:00 <type 'datetime.datetime'>

theDateTime == 2011-10-05 20:53:00 <type 'datetime.datetime'>
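
The listing above uses Python 2 print statements. As a rough Python 3 sketch of the same idea, the date can be taken from the URL with a regex and joined to the parsed time with datetime.combine. The helper name combine_url_date_and_time and the DATE_RE constant are hypothetical names chosen for this illustration.

```python
import re
from datetime import date, datetime

# Same three capture groups as the regex above, written as a raw string.
DATE_RE = re.compile(r'year=(\d{4})&month=(\d\d?)&day=(\d\d?)')


def combine_url_date_and_time(url, time_text):
    """Hypothetical helper: date from the URL, time of day from time_text."""
    match = DATE_RE.search(url)
    if match is None:
        raise ValueError("no year/month/day found in %r" % url)
    the_date = date(*map(int, match.groups()))
    # strptime fills in 1900-01-01 for the missing date; keep only the time.
    the_time = datetime.strptime(time_text, "%I:%M%p").time()
    return datetime.combine(the_date, the_time)


print(combine_url_date_and_time(
    'http://site.com/?year=2011&month=10&day=5&hour=11', '08:53PM'))
```

Using datetime.combine avoids the intermediate 1900-01-01 datetime that has to be patched with replace().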

Edit 1

Since strptime() is slow, I rewrote my code to avoid it:

from datetime import datetime
import re
from time import clock


n = 10000

givenURL  = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '08:53AM'

# eyquem
regx = re.compile('year=(\d\d\d\d)&month=(\d\d?)&day=(\d\d?)&hour=\d\d? (\d\d?):(\d\d?)(PM|pm)?')
t0 = clock()
for i in xrange(n):
    given = givenURL + ' ' + givenTime
    mat = regx.search(given)
    grps = map(int,mat.group(1,2,3,4,5))
    if mat.group(6):
        grps[3] += 12  # when it is PM/pm, 12 must be added to the hour
    theDateTime1 = datetime(*grps)
print clock()-t0,"seconds   eyquem's code"
print theDateTime1


print

# Artsiom Rudzenka
dateandtimePattern = "http://site.com/?year=%Y&month=%m&day=%d&time=%I:%M%p"
t0 = clock()
for i in xrange(n):
    theDateTime2 = datetime.strptime(givenURL.split('&hour=')[0] + '&time=' + givenTime, dateandtimePattern)
print clock()-t0,"seconds   Artsiom's code"
print theDateTime2

print
print theDateTime1 == theDateTime2

result

0.460598763251 seconds   eyquem's code
2011-10-05 08:53:00

2.10386180366 seconds   Artsiom's code
2011-10-05 08:53:00

True

My code is about 4.5 times faster. That may matter if there are a lot of such conversions to perform.
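
One caveat about the manual conversion in the timed loop: adding 12 whenever PM matches is right for 1 PM to 11 PM, but 12:xxPM would give hour 24, which datetime rejects, and 12:xxAM should map to hour 0. A minimal sketch of a conversion that covers both edge cases follows, in Python 3 syntax. The function name to_24_hour is hypothetical and not part of the original answer.

```python
def to_24_hour(hour, meridiem):
    """Convert a 12-hour clock hour (1-12) plus 'AM'/'PM' to the range 0-23."""
    if not 1 <= hour <= 12:
        raise ValueError("hour must be in 1..12, got %d" % hour)
    meridiem = meridiem.upper()
    if meridiem not in ("AM", "PM"):
        raise ValueError("meridiem must be AM or PM, got %r" % meridiem)
    hour %= 12  # 12 becomes 0, every other hour is unchanged
    if meridiem == "PM":
        hour += 12
    return hour


for hour, meridiem in [(8, "AM"), (8, "PM"), (12, "AM"), (12, "PM")]:
    print(hour, meridiem, "->", to_24_hour(hour, meridiem))
```

To use it with the regex approach, the pattern would also need to capture AM/am so the 12 AM case can be told apart from 12 PM.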

Solution 3:

There's no way to do that with the format string. However, if the hour doesn't matter, you can get it from the URL as in your first example and then call theDateTime.replace(hour=hour_from_a_different_source).

That way you don't have to do any additional parsing.
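
A minimal sketch of that suggestion, in Python 3 syntax, might look like the following. The format pattern for the full URL is an assumption, since the question's first example is cut off above, and the minute is replaced along with the hour so that the result matches the given time.

```python
from datetime import datetime

givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '8:53PM'

# Parse the whole URL, including the hour it carries (assumed pattern).
theDateTime = datetime.strptime(
    givenURL, "http://site.com/?year=%Y&month=%m&day=%d&hour=%H")

# Parse the time from the other source, then overwrite hour and minute.
theTime = datetime.strptime(givenTime, "%I:%M%p")
theDateTime = theDateTime.replace(hour=theTime.hour, minute=theTime.minute)
print(theDateTime)
```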
