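People talk about the Python GIL all the time, myself included, but I had never actually measured its effect.
Today I tested how fast Python downloads images with multiple threads, multiple processes, and a single thread.
In my tests, multithreading was still much faster than a single thread for I/O-bound work. To quote another blogger's explanation of why: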
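I/O divides into network I/O and disk I/O. In general, I/O has two phases: sending data (output) and receiving the returned data (input). Taking the browser as the point of view: the browser sends a request to the server (output), and the server returns the result to the browser (input). When a thread blocks on I/O, Python releases the GIL (global interpreter lock), so while the current thread is blocked waiting for its response, another thread can go ahead and send its own request (output); a third thread can then send its request while the second is blocked waiting. In other words, within the same time slice one thread is waiting for data while another is sending data, which reduces the total time spent on I/O.
--------------------- Author: daijiguo. Source: CSDN. Original: https://blog.csdn.net/daijiguo/article/details/78042309 Copyright notice: this is the blogger's original article; please include the link to it when reposting.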
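The overlap described in the quote can be seen without touching the network. The following is a minimal sketch, not the quoted author's code: it assumes `time.sleep` as a stand-in for a blocking request (like blocking socket reads, it releases the GIL while waiting), and the names `fake_request`, `run_sequential`, and `run_threaded` are made up for illustration.

```python
import threading
import time


def fake_request(delay):
    """Stand-in for a blocking network call; time.sleep releases the GIL while it waits."""
    time.sleep(delay)


def run_sequential(n, delay):
    """Issue n fake requests one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fake_request(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    """Issue n fake requests on n threads and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_request, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread, so the waits overlap instead of adding up
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", run_sequential(4, 0.5))
    print("threaded:  ", run_threaded(4, 0.5))
```

The sequential run has to wait for each delay in turn, while the threaded run waits for all of them at once, so its elapsed time should be close to a single delay.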
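As for threads versus processes: when downloading many images (say, more images than CPU cores) and each image is fairly small, multithreading appears to be faster.
My guess is that with more images than cores, both the processes and the threads have to be switched. Processes are switched less often, but each one costs more to create and switch. Threads are cheap, even though the GIL lets only one of them run Python code at a time, and because each download is small there are few thread switches as well, so multithreading comes out ahead.
When downloading only a few images (fewer than the number of CPU cores) that are each fairly large, multiprocessing can make full use of the CPUs with no process switching, which cuts overhead. The threads, by contrast, have to keep switching between tasks, and with large images that constant switching adds overhead and lowers efficiency, so multithreading ends up slower than multiprocessing.
That is my understanding for now, anyway...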
Some day I will also compare multithreading and multiprocessing on CPU-bound work...
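For that CPU-bound comparison, one possible starting point is sketched below. It is not something I have measured here: `count_primes` is a made-up CPU-heavy task, `timed_map` and `compare` are hypothetical helpers, and the workload sizes are arbitrary. It uses the standard-library `concurrent.futures` executors rather than raw `Thread` and `Process` objects.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound stand-in task: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, limits, workers=4):
    """Run count_primes over limits on the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, limits))
    return results, time.perf_counter() - start


def compare(limits, workers=4):
    """Time the same workload on a thread pool and on a process pool."""
    timings = {}
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, elapsed = timed_map(executor_cls, limits, workers)
        timings[executor_cls.__name__] = elapsed
    return timings
```

Calling something like `compare([20_000] * 4)` should happen under an `if __name__ == '__main__':` guard, because the process pool may re-import the main module on platforms that spawn workers. On a standard GIL build I would expect the thread pool to show little or no speedup on this kind of task, but that is exactly what the measurement is meant to check.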
import requests
import time
from threading import Thread
import threading
import multiprocessing

python_list = [
    'https://www.python.org/ftp/python/3.5.7/Python-3.5.7.tgz',
    'https://www.python.org/ftp/python/3.7.3/python-3.7.3.exe',
    'https://www.python.org/ftp/python/2.7.16/python-2.7.16.amd64.msi'
]

large_url_list = [
    # python.org address; not large, but the overseas host is relatively slow
    'https://www.python.org/ftp/python/3.7.3/python-3.7.3.exe',
    # 'https://raw.githubusercontent.com/mymmsc/books/master/%E7%AE%97%E6%B3%95%E5%AF%BC%E8%AE%BA%E4%B8%AD%E6%96%87%E7%89%88.pdf',
    'http://codown.youdao.com/cidian/YoudaoDict_webdict_default.exe',
    'https://down.360safe.com/setup.exe',
    'https://d1.music.126.net/dmusic/cloudmusicsetup_2.5.2.197409.exe',
    'http://pcclient.download.youku.com/youkuclient/youkuclient_setup_7.7.7.4191.exe',
    'http://dl2.xmind.cn/xmind-8-update8-windows.exe',
    'https://cdn-dl.yinxiang.com/YXWin6/public/Evernote_6.17.20.667.exe',
]

url_list = [
    'https://images7.alphacoders.com/333/333388.jpg',
    'https://images2.alphacoders.com/597/597309.jpg',
    'https://images8.alphacoders.com/562/562449.jpg',
    'https://images.alphacoders.com/562/562450.jpg',
    'https://images3.alphacoders.com/562/562451.jpg',
    'https://images.alphacoders.com/562/562452.jpg',
    'https://images2.alphacoders.com/101/1011957.jpg',
    'https://images6.alphacoders.com/101/1011958.jpg',
    'https://images5.alphacoders.com/101/1011959.jpg',
    'https://images8.alphacoders.com/101/1011961.jpg',
    'https://images3.alphacoders.com/692/692439.jpg',
    'https://images4.alphacoders.com/940/940881.jpg',
    'https://images5.alphacoders.com/689/689398.jpg',
    'https://images5.alphacoders.com/757/757038.jpg',
]

# All three modes append their timings to this file
time_path = 'time_compare.txt'


# Request the URL and save the response body
# (the file is named N.jpg whatever the actual file type is)
def save_pic(url, count):
    # print(url)
    # print('save_pic', threading.current_thread())
    file_name = (str(count + 1) + '.jpg')
    res = requests.get(url)
    # Print the downloaded size in whole MiB
    print(len(res.content) // 1024 // 1024, url)
    with open(file_name, 'wb') as f:
        f.write(res.content)


# Single thread
def single_download(url_list):
    # print(threading.current_thread())
    s_time = time.time()
    for i in range(len(url_list)):
        res = requests.get(url_list[i])
        print(len(res.content) // 1024 // 1024)
        file_name = str(i + 1) + '.jpg'
        with open(file_name, 'wb') as f:
            f.write(res.content)
    e_time = time.time()
    t_time = e_time - s_time
    # with open('single_download.txt','a') as f:
    with open(time_path, 'a') as f:
        f.write('Single-threaded total time:%r' % t_time + '\n' + '\n')
    print('Single-threaded total time:%r' % t_time)


# Multiple threads
def thread_download(save_pic, url_list):
    threads = []
    start = time.time()
    for i in range(len(url_list)):
        # Create one thread per URL
        t = Thread(target=save_pic, args=[url_list[i], i])
        # t.setDaemon(True)
        t.start()
        threads.append(t)
        # Joining here would run the threads one at a time, in order
        # t.join()
    # Threads run concurrently; wait for all of them to finish
    # print('thread_download', threading.current_thread())
    for t in threads:
        t.join()
    end = time.time()
    print('Multithreaded total time:%r' % (end - start))
    # with open('thread_download.txt','a') as f:
    with open(time_path, 'a') as f:
        f.write('Multithreaded total time:%r' % (end - start) + '\n')


# Multiple processes
def process_download(save_pic, url_list):
    processes = []
    start = time.time()
    for i in range(len(url_list)):
        # Create one process per URL
        p = multiprocessing.Process(target=save_pic, args=[url_list[i], i])
        p.start()
        processes.append(p)
        # Joining here would run the processes one at a time, in order
        # p.join()
    # Processes run concurrently; wait for all of them to finish
    # print('process_download', threading.currentThread())
    for p in processes:
        p.join()
    end = time.time()
    print('Multiprocess total time:%r' % (end - start))
    # with open('thread_download.txt','a') as f:
    with open(time_path, 'a') as f:
        f.write('Multiprocess total time:%r' % (end - start) + '\n')


if __name__ == '__main__':
    # Note: as posted, the single-threaded run uses a different URL list
    # from the other two; pass the same list to all three to compare them directly.
    thread_download(save_pic, python_list)
    process_download(save_pic, python_list)
    single_download(large_url_list)
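The script starts one thread (or one process) per URL, so the number of workers grows with the list. To test the guess about switching overhead, it may help to fix the worker count and vary only the list. Below is a minimal sketch of that idea, with some assumptions: it uses the standard-library `ThreadPoolExecutor`, swaps `requests` for `urllib.request` so it has no third-party dependency, and the names `fetch_bytes` and `download_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_bytes(url, timeout=30):
    """Download one URL and return its body as bytes (stdlib stand-in for requests.get)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_all(urls, fetch=fetch_bytes, max_workers=8):
    """Fetch every URL on a bounded thread pool; payloads come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

The `fetch` parameter is there so the function can be exercised with a fake downloader, for example `download_all(['a', 'b'], fetch=str.encode)`, before pointing it at real URLs.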
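Timing comparison (total elapsed seconds, as appended to time_compare.txt over repeated runs):
Multithreaded total time:22.477999925613403
Multiprocess total time:31.263000011444092
Single-threaded total time:25.10800004005432

Multithreaded total time:21.917999982833862
Multiprocess total time:28.180999994277954
Single-threaded total time:21.52900004386902

Multithreaded total time:6.33299994468689
Multiprocess total time:6.327999830245972
Single-threaded total time:21.680999994277954

Multithreaded total time:4.704999923706055
Multiprocess total time:7.363000154495239
Single-threaded total time:22.16599988937378

Multithreaded total time:4.493000030517578
Multiprocess total time:5.243000030517578
Single-threaded total time:20.289999961853027

Multithreaded total time:7.164999961853027
Multiprocess total time:6.3429999351501465
Single-threaded total time:40.97699999809265

Multithreaded total time:10.406000137329102
Multiprocess total time:11.692000150680542
Single-threaded total time:39.74600005149841

Multithreaded total time:11.069999933242798
Multiprocess total time:13.827999830245972
Single-threaded total time:55.35499978065491

Multithreaded total time:12.45300006866455
Multiprocess total time:15.381999969482422
Multithreaded total time:14.733000040054321
Multiprocess total time:17.787999868392944
Multithreaded total time:67.04800009727478
Multiprocess total time:65.76999998092651
Multithreaded total time:11.710999965667725
Multiprocess total time:13.263000011444092
Multithreaded total time:150.0369999408722
Multiprocess total time:87.61500000953674
Multithreaded total time:207.85199999809265
Multiprocess total time:85.44199991226196
Multithreaded total time:14.031000137329102
Multiprocess total time:16.914999961853027
Single-threaded total time:16.836000204086304

Multithreaded total time:16.92199993133545
Multiprocess total time:24.299000024795532
Single-threaded total time:20.825999975204468

Multithreaded total time:24.26200008392334
Multiprocess total time:25.591000080108643
Single-threaded total time:39.54299998283386

Multithreaded total time:42.15599989891052
Multiprocess total time:43.079999923706055
Multithreaded total time:45.169999837875366
Multiprocess total time:39.575000047683716
Multithreaded total time:50.48699998855591
Multiprocess total time:54.603999853134155
Multithreaded total time:55.680999994277954
Multiprocess total time:57.11299991607666
Multithreaded total time:51.34699988365173
Multithreaded total time:68.9359998703003
Multiprocess total time:60.924999952316284
Multithreaded total time:53.098999977111816
Multiprocess total time:55.61199998855591
Multithreaded total time:52.46000003814697
Multiprocess total time:51.26799988746643
Multithreaded total time:226.48599982261658
Multiprocess total time:211.4670000076294
Multithreaded total time:11.33299994468689
Multiprocess total time:15.307000160217285
Multithreaded total time:11.495000123977661
Multiprocess total time:11.54800009727478
Multithreaded total time:9.815999984741211
Multiprocess total time:10.997999906539917
Multithreaded total time:162.45900011062622
Multiprocess total time:180.01900005340576
Multithreaded total time:214.36699986457825
Multiprocess total time:157.90300011634827
Multithreaded total time:152.77100014686584
Multiprocess total time:136.43899989128113
Multithreaded total time:108.96199989318848
Multiprocess total time:104.80599999427795
Multithreaded total time:81.69500017166138
Multiprocess total time:82.85199999809265
Single-threaded total time:176.9119999408722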
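A raw log like this is hard to compare by eye, so a small summary helper can make it easier to read. The sketch below is an illustration only: `summarize_timings` is a hypothetical name, and it assumes each non-blank line has the form `label:seconds`, which is the format the script writes. It groups by whatever label precedes the last colon, so it does not depend on the language of the labels.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_timings(lines):
    """Group 'label:seconds' lines by label and report run count, mean, and median."""
    samples = defaultdict(list)
    for line in lines:
        label, sep, value = line.strip().rpartition(":")
        if not sep:
            continue  # blank line or a line without a colon
        samples[label.strip()].append(float(value))
    return {
        label: {"runs": len(values), "mean": mean(values), "median": median(values)}
        for label, values in samples.items()
    }
```

It can be fed an open file directly, for example `summarize_timings(open('time_compare.txt'))`, passing whatever `encoding` the file was written with. Since the log mixes runs over different URL lists and network conditions, the averages are only a rough guide; the median is less sensitive to the few very slow runs.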