0%

Python Challenge (Level 20)

第20关

这一关要用到httplib.很遗憾我对于HTTP以及httplib的了解几乎为零,只能找下网上的解题方法。这个外国人写的解题过程非常详细。

看了解题过程后,总结一下主要有三个步骤:

  • 从小到大修改请求的headers中的content-range,得到提示:invader
  • 从大到小修改请求的headers中的content-range,得到密码redavni(即invader的reverse)。同时,还得到一个zip文件
  • 用密码将zip文件解压,即可通关

好吧,也许看完上面三个步骤你还是不知所云。下面一步一步详细解释一下。

用到的库

1
2
3
4
import httplib
import base64
from pprint import pprint
import re

第一步

首先,定义一个get_range方法。给定range,返回相应的response.

1
2
3
4
5
6
def get_range(page, base, limit):
conn = httplib.HTTPConnection('www.pythonchallenge.com')
headers = {'Authorization': 'Basic ' + base64.b64encode('butter:fly'),
'Range': 'bytes=%s-%s' % (base, limit)}
conn.request('GET', page, '', headers)
return conn.getresponse()

利用这个方法,我们可以看看headers.注意看content-range.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
In [81]: r.getheaders()
Out[81]:
[('x-powered-by', 'PHP/5.3.3-7+squeeze17'),
('transfer-encoding', 'chunked'),
('server', 'lighttpd/1.4.28'),
('content-range', 'bytes 0-30202/2123456789'),
('date', 'Tue, 03 Feb 2015 09:37:11 GMT'),
('content-type', 'image/jpeg')]

In [82]: r = get_range('/pc/hex/unreal.jpg', 30203, '')

In [83]: r.getheaders()
Out[83]:
[('x-powered-by', 'PHP/5.3.3-7+squeeze17'),
('content-transfer-encoding', 'binary'),
('server', 'lighttpd/1.4.28'),
('transfer-encoding', 'chunked'),
('content-range', 'bytes 30203-30236/2123456789'),
('date', 'Tue, 03 Feb 2015 09:37:33 GMT'),
('content-type', 'application/octet-stream')]

In [84]: r.read()
Out[84]: "Why don't you respect my privacy?\n"

可以看到,用不同的range,得到的response的内容不一样,同时headers里面有下一个range. 这有点像第4关,不断获取下一个网址。不断获取下一个range,再不断moveForward.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
def next_range(base, bases, results):
r = get_range('/pc/hex/unreal.jpg', base, 2123456789)
bases.append(base)
results.append(r.read())
try:
m = re.match('bytes %d-([0-9]+)/2123456789' % base, r.getheader('content-range'))
return int(m.group(1)) + 1
except:
return "ERR"

def moveForward():
bases = []
results = []
b = 30203
while True:
b = next_range(b, bases, results)
if b == "ERR":
break
pprint(results)
pprint(bases)

我们会得到这些信息:

[“Why don’t you respect my privacy?\n”,
‘we can go on in this way for really long time.\n’,
‘stop this!\n’,
‘invader! invader!\n’,
‘ok, invader. you are inside now. \n’,
‘’]

好了,这里invader出现了好几次,我们要留意一下。

第二步

range超过30347后,再增大也没有信息了。我们改成2123456789

1
2
3
4
In [85]: r = get_range('/pc/hex/unreal.jpg', 2123456789, '')

In [86]: r.read()
Out[86]: 'esrever ni emankcin wen ruoy si drowssap eht\n'

这串信息reverse就得到: the password is your new nickname in reverse

到这里就得到了一个密码,redavni(”invader” reverse)

第三步

range为2123456789时,返回的请求里面content-range也有变化。与第一步类似,我们往小改,看看结果是什么。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
In [87]: r.getheaders() 
Out[87]:
[('x-powered-by', 'PHP/5.3.3-7+squeeze17'),
('content-transfer-encoding', 'binary'),
('server', 'lighttpd/1.4.28'),
('transfer-encoding', 'chunked'),
('content-range', 'bytes 2123456744-2123456788/2123456789'),
('date', 'Tue, 03 Feb 2015 09:46:02 GMT'),
('content-type', 'application/octet-stream')]

In [88]: r = get_range('/pc/hex/unreal.jpg', 2123456743, '')

In [89]: r.getheaders()
Out[89]:
[('x-powered-by', 'PHP/5.3.3-7+squeeze17'),
('content-transfer-encoding', 'binary'),
('server', 'lighttpd/1.4.28'),
('transfer-encoding', 'chunked'),
('content-range', 'bytes 2123456712-2123456743/2123456789'),
('date', 'Tue, 03 Feb 2015 09:52:03 GMT'),
('content-type', 'application/octet-stream')]

In [90]: r.read()
Out[90]: 'and it is hiding at 1152983631.\n'

这里提示1152983631里面有东西,我们print出来看看是什么

1
2
3
4
5
6
7
8
9
10
11
In [91]: r = get_range('/pc/hex/unreal.jpg', 1152983631, '')

In [92]: data = r.read()

In [93]: len(data)
Out[93]: 239733

In [94]: data[:10]
Out[94]: 'PK\x03\x04\x14\x00\t\x00\x08\x00'

In [95]:

网上的大神表示根据前面四个bytes就看出它是一个zip file. 后续工作就是data写到zip文件,然后用前面的密码解压文件。这就进入第21关了。21关就是要搞zip文件里的东西,并不是一个网址。

Putting it All Together

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
# -×- coding:utf-8 -*-
__author__ = 'chen'
import httplib
import base64
from pprint import pprint
import re

def get_range(page, base, limit):
conn = httplib.HTTPConnection('www.pythonchallenge.com')
headers = {'Authorization': 'Basic ' + base64.b64encode('butter:fly'),
'Range': 'bytes=%s-%s' % (base, limit)}
conn.request('GET', page, '', headers)
return conn.getresponse()

def next_range(base, bases, results):
r = get_range('/pc/hex/unreal.jpg', base, 2123456789)
bases.append(base)
results.append(r.read())
try:
m = re.match('bytes %d-([0-9]+)/2123456789' % base, r.getheader('content-range'))
return int(m.group(1)) + 1
except:
return "ERR"

def moveForward():
bases = []
results = []
b = 30203
while True:
b = next_range(b, bases, results)
if b == "ERR":
break
pprint(results)
pprint(bases)

def solve():
moveForward()
# 得到重要信息:invader,记住,you're invader

r = get_range('/pc/hex/unreal.jpg', 2123456789, '')
msg = r.read()
print msg
print msg[::-1]
print 'invader'[::-1]
pprint(r.getheaders())
# 提示密码是你的new nickname反转过来,所以就是invader ---> redavni

print get_range('/pc/hex/unreal.jpg', 2123456743, '').read()
data = get_range('/pc/hex/unreal.jpg', 1152983631, '').read()
open('unreal.zip', 'wb').write(data)
# 类似第一步,但是这次从后向前找,GET两次就得到一些数据.
# 将其保存为zip文件,用上一步得到的密码解压

if __name__ == "__main__" : solve()