After several months of being "stalled", this finally works, although so far only on a VirtualBox machine.
This howto is not for people who hate YouTube and Google Maps.
It is for YouTube and Google Maps lovers.
Reference material I read:
http://www.mail-archive.com/squid-users@squid-cache.org/msg54605.html
http://www.mail-archive.com/squid-users@squid-cache.org/msg51076.html
http://wiki.squid-cache.org/Features/StoreUrlRewrite
http://wiki.squid-cache.org/Features/StoreUrlRewrite/RewriteScript
The version I use is squid-2.7.STABLE3; I do not know whether other versions support this.
1. Create a script to manipulate the YouTube URLs.
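#!/usr/bin/perl
# Store-URL rewriter: maps every request for the same video to one internal URL.
$|=1; # unbuffered output, Squid waits for each answer
while (<>) {
@X = split;
$url = $X[0];
# The "&" patterns run before the "$" patterns; otherwise the lazy (.*?)$
# would swallow the trailing parameters into the ID.
$url =~s@^http://(.*?)/get_video\?(.*)video_id=(.*?)&.*@squid://videos.youtube.INTERNAL/ID=$3@;
$url =~s@^http://(.*?)/get_video\?(.*)video_id=(.*?)$@squid://videos.youtube.INTERNAL/ID=$3@;
$url =~s@^http://(.*?)/videodownload\?(.*)docid=(.*?)&.*@squid://videos.google.INTERNAL/ID=$3@;
$url =~s@^http://(.*?)/videodownload\?(.*)docid=(.*?)$@squid://videos.google.INTERNAL/ID=$3@;
print "$url\n"; }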
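To see what the script does to a request, the following Python sketch imitates its get_video and videodownload substitutions. It is only an illustration written for this howto, not the helper that Squid runs, and the function name store_key is made up here. The two sample URLs are taken from the access.log output shown further down; they differ only in the t= token.

```python
import re

# Python rendering of the Perl substitutions, tried in order.
# The "&" form comes first so that the ID stops at the next parameter.
RULES = [
    (r"^http://(.*?)/get_video\?(.*)video_id=(.*?)&.*",
     r"squid://videos.youtube.INTERNAL/ID=\3"),
    (r"^http://(.*?)/get_video\?(.*)video_id=(.*?)$",
     r"squid://videos.youtube.INTERNAL/ID=\3"),
    (r"^http://(.*?)/videodownload\?(.*)docid=(.*?)&.*",
     r"squid://videos.google.INTERNAL/ID=\3"),
    (r"^http://(.*?)/videodownload\?(.*)docid=(.*?)$",
     r"squid://videos.google.INTERNAL/ID=\3"),
]


def store_key(url):
    """Return the internal store URL for a video request, else the URL unchanged."""
    for pattern, replacement in RULES:
        url = re.sub(pattern, replacement, url, count=1)
    return url


a = store_key("http://youtube.com/get_video?video_id=2d55B-SiJdM"
              "&t=OEgsToPDskKrwAAE_vVIhOqMhPqmPDUQ")
b = store_key("http://youtube.com/get_video?video_id=2d55B-SiJdM"
              "&t=OEgsToPDskLGVqEnxKjLEN4DGA3HYGse")
print(a)
print(a == b)
```

Both requests end up under the same key, which is why the second client can be served from the cache even though its URL is different.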
2. Then edit squid.conf as shown below:
acl store_rewrite_list url_regex ^http://(.*?)/get_video\?
acl store_rewrite_list url_regex ^http://(.*?)/videodownload\?
cache allow store_rewrite_list
# Had to uncomment this again, because I couldn't log in to Google Mail using IE6 (Firefox had no trouble):
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
refresh_pattern ^http://(.*?)/get_video\? 10080 90% 999999 override-expire ignore-no-cache ignore-private
refresh_pattern ^http://(.*?)/videodownload\? 10080 90% 999999 override-expire ignore-no-cache ignore-private
storeurl_access allow store_rewrite_list
storeurl_access deny all
storeurl_rewrite_program /usr/local/bin/store_url_rewrite
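The order of the two cache lines matters: as far as I understand Squid's access lists, the rules are checked top to bottom and the first one that matches decides. The Python sketch below imitates that decision for the fragment above. It is a simplified model with made-up function names, and it assumes that a request matching no rule gets the opposite of the last rule (here: allow).

```python
import re

# acl store_rewrite_list url_regex ... (matched against the whole URL)
STORE_REWRITE_LIST = [
    re.compile(r"^http://(.*?)/get_video\?"),
    re.compile(r"^http://(.*?)/videodownload\?"),
]
# acl QUERY urlpath_regex cgi-bin \? (matched against the path and query only)
QUERY = [re.compile(r"cgi-bin"), re.compile(r"\?")]


def url_path(url):
    """Return everything after the host name, or '/' if there is no path."""
    rest = url.split("://", 1)[-1]
    slash = rest.find("/")
    return rest[slash:] if slash != -1 else "/"


def cacheable(url):
    """First matching rule wins, in the same order as in squid.conf."""
    if any(p.search(url) for p in STORE_REWRITE_LIST):
        return True            # cache allow store_rewrite_list
    if any(p.search(url_path(url)) for p in QUERY):
        return False           # cache deny QUERY
    return True                # assumed default: opposite of the last rule


for url in [
    "http://youtube.com/get_video?video_id=abc&t=1",
    "http://example.com/search?q=squid",
    "http://example.com/cgi-bin/form",
    "http://example.com/index.html",
]:
    print(cacheable(url), url)
```

If the deny line came first, the video URLs would be rejected by QUERY because they contain a question mark, and nothing would be stored for them.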
The result can be seen in access.log: when the same video is requested again, it is served straight from the cache as a hit.
# grep youtube access.log | grep TCP_HIT
1214834411.379 735 192.168.1.89 TCP_HIT/200 1604459 GET http://youtube.com/get_video?video_id=2d55B-SiJdM&t=OEgsToPDskKrwAAE_vVIhOqMhPqmPDUQ - NONE/- video/flv
1214834487.090 818 192.168.1.94 TCP_HIT/200 1604459 GET http://youtube.com/get_video?video_id=2d55B-SiJdM&t=OEgsToPDskLGVqEnxKjLEN4DGA3HYGse - NONE/- video/flv
1214836269.353 4383 192.168.1.91 TCP_HIT/200 9533167 GET http://youtube.com/get_video?video_id=i6cKRT12jgw&t=OEgsToPDskKeQxYVvYZ7fgEIW4UNC_U- - NONE/- video/flv
1214836514.802 3757 192.168.1.91 TCP_HIT/200 9533167 GET http://youtube.com/get_video?video_id=i6cKRT12jgw&t=OEgsToPDskIEwsTb26LiGFc96hBUUa9Z - NONE/- video/flv
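The same check can be scripted. The sketch below is a rough Python equivalent of the grep above that also groups the hits by video_id. The function name count_hits is hypothetical, and the code assumes Squid's native log format as in the lines above, with the result code in the fourth field and the URL in the seventh.

```python
import re
from collections import Counter

# The four log lines shown above, used as sample input.
SAMPLE_LOG = """\
1214834411.379 735 192.168.1.89 TCP_HIT/200 1604459 GET http://youtube.com/get_video?video_id=2d55B-SiJdM&t=OEgsToPDskKrwAAE_vVIhOqMhPqmPDUQ - NONE/- video/flv
1214834487.090 818 192.168.1.94 TCP_HIT/200 1604459 GET http://youtube.com/get_video?video_id=2d55B-SiJdM&t=OEgsToPDskLGVqEnxKjLEN4DGA3HYGse - NONE/- video/flv
1214836269.353 4383 192.168.1.91 TCP_HIT/200 9533167 GET http://youtube.com/get_video?video_id=i6cKRT12jgw&t=OEgsToPDskKeQxYVvYZ7fgEIW4UNC_U- - NONE/- video/flv
1214836514.802 3757 192.168.1.91 TCP_HIT/200 9533167 GET http://youtube.com/get_video?video_id=i6cKRT12jgw&t=OEgsToPDskIEwsTb26LiGFc96hBUUa9Z - NONE/- video/flv
"""


def count_hits(lines):
    """Count TCP_HIT entries per video_id in access.log lines."""
    hits = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7 or not fields[3].startswith("TCP_HIT/"):
            continue
        match = re.search(r"[?&]video_id=([^&]+)", fields[6])
        if match:
            hits[match.group(1)] += 1
    return hits


hits = count_hits(SAMPLE_LOG.splitlines())
for video_id, count in sorted(hits.items()):
    print(video_id, count)
```

For a real log, the open file can be passed in directly, for example count_hits(open("/var/log/squid/access.log")) if that is where your access_log points.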
A note from Horacio Herrera Gonzalez: because the basic script is not specific to particular URLs,
Warning! This code may match other sites not related to YT or GV.
He he he he, so keep an eye on your bandwidth.
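To make the warning concrete, the sketch below shows that the get_video ACL pattern accepts any host name, and then narrows it with a host allow-list. The allow-list and both function names are assumptions made for this illustration only; they are not part of the original script or of Squid.

```python
import re
from urllib.parse import urlsplit

# The pattern used in the store_rewrite_list ACL.
ACL_PATTERN = re.compile(r"^http://(.*?)/get_video\?")

# Hypothetical allow-list: youtube.com and its subdomains only.
ALLOWED_HOST = re.compile(r"(^|\.)youtube\.com$")


def matches_acl(url):
    """True if the URL would be picked up by the ACL pattern."""
    return ACL_PATTERN.search(url) is not None


def matches_acl_strict(url):
    """Same as matches_acl, but the host must also be on the allow-list."""
    host = urlsplit(url).hostname or ""
    return matches_acl(url) and ALLOWED_HOST.search(host) is not None


print(matches_acl("http://example.org/get_video?video_id=abc"))         # True
print(matches_acl_strict("http://example.org/get_video?video_id=abc"))  # False
print(matches_acl_strict("http://youtube.com/get_video?video_id=abc"))  # True
```

In squid.conf a similar narrowing might be achieved by listing a dstdomain ACL next to store_rewrite_list on the storeurl_access line, since ACLs on one line are ANDed; treat that as a suggestion to verify, not as a tested setting.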
Because several users had trouble applying the YouTube caching,
the steps below are the sequence I used on my own server.
- I use the TSL 3.05 distro, with squid-2.7.STABLE3
./configure \
--sysconfdir=/etc/squid \
--prefix=/usr \
--enable-async-io \
--enable-removal-policies=lru,heap \
--disable-delay-pools \
--disable-wccp \
--disable-wccp2 \
--enable-kill-parent-hack \
--enable-snmp \
--enable-default-err-languages=English --enable-err-languages=English \
--enable-linux-netfilter \
--disable-auth
- config obtained by filtering the ^# (comment) lines out of squid.conf
acl all src all
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 3128 transparent
hierarchy_stoplist cgi-bin ?
cache_mem 6 MB
maximum_object_size_in_memory 32 KB
memory_replacement_policy heap GDSF
cache_replacement_policy heap LFUDA
cache_dir aufs /nfs/cache 20000 16 256
maximum_object_size 64 MB
cache_swap_low 98
cache_swap_high 99
access_log /var/log/squid/access.log squid
cache_log /var/log/squid/cache.log
cache_store_log none
log_fqdn off
storeurl_rewrite_program /etc/squid/store_url_rewrite
acl store_rewrite_list url_regex ^http://(.*?)/get_video\?
acl store_rewrite_list url_regex ^http://(.*?)/videodownload\?
storeurl_access allow store_rewrite_list
storeurl_access deny all
cache allow store_rewrite_list
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
refresh_pattern ^http://(.*?)/get_video\? 10080 90% 999999 override-expire ignore-no-cache ignore-private
refresh_pattern ^http://(.*?)/videodownload\? 10080 90% 999999 override-expire ignore-no-cache ignore-private
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
quick_abort_min 0
quick_abort_max 0
quick_abort_pct 98
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
vary_ignore_expire on
cache_effective_user squid
cache_effective_group squid
log_icp_queries off
ipcache_size 2048
ipcache_low 98
ipcache_high 99
memory_pools off
reload_into_ims on
coredump_dir /usr/var/cache
pipeline_prefetch on
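The listing above is squid.conf with its comment lines filtered out. A minimal Python sketch of that filtering step follows. The function name and the sample text are made up for the illustration, and blank lines are dropped as well, which is an assumption about how the listing was produced.

```python
def strip_comments(text):
    """Keep only lines that are neither blank nor start with '#'."""
    kept = []
    for line in text.splitlines():
        if line.strip() and not line.startswith("#"):
            kept.append(line)
    return "\n".join(kept)


# Made-up excerpt in the style of a stock squid.conf.
sample = """\
# TAG: http_port
http_port 3128 transparent

#Default:
# cache_mem 8 MB
cache_mem 6 MB
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
"""

print(strip_comments(sample))
```

Comments that follow a directive on the same line are kept, which matches the localnet lines in the listing.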
A contribution from apit (YM ID relative_04): caching for Photobucket, which is widely used on Friendster.
In store_url_rewrite:
# Use the path below /albums/ as the ID and drop the query string.
$url =~s@^http://i(.*?)\.photobucket\.com/albums/(.*?)\?.*@squid://images.photobucket.INTERNAL/ID=$2@;
$url =~s@^http://vid(.*?)\.photobucket\.com/albums/(.*?)\?.*@squid://videos.photobucket.INTERNAL/ID=$2@;
In squid.conf:
acl store_rewrite_list url_regex ^http://i(.*?).photobucket.com/albums/(.*?)/(.*?)/(.*?)\?
acl store_rewrite_list url_regex ^http://vid(.*?).photobucket.com/albums/(.*?)/(.*?)\?
refresh_pattern ^http://i(.*?).photobucket.com/albums/(.*?)/(.*?)/(.*?)\? 43200 90% 999999 override-expire ignore-no-cache ignore-private
refresh_pattern ^http://vid(.*?).photobucket.com/albums/(.*?)/(.*?)\? 43200 90% 999999 override-expire ignore-no-cache ignore-private
The result:
TCP_HIT/200 5474813 GET http://vid264.photobucket.com/albums/ii163/shannonwiseman12/DSCN0212.flv - NONE/- text/plain
Update script
YouTube appears to have changed its system around the first quarter of 2009. As a result, the script above no longer works; to fix this, the script and several parts of the configuration have to be changed.
Fortunately, a guide is already available at http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube/Discussion
I tested the configuration below on a VMware machine running CentOS 5.2, in July 2009.
To make things easier, I include the modified squid.conf and the URL rewriter script.
acl all src all
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8
acl localnet src 10.0.0.0/8
acl localnet src 172.16.0.0/12
acl localnet src 192.168.0.0/16
acl SSL_ports port 443
acl Safe_ports port 80
acl Safe_ports port 21
acl Safe_ports port 443
acl Safe_ports port 70
acl Safe_ports port 210
acl Safe_ports port 1025-65535
acl Safe_ports port 280
acl Safe_ports port 488
acl Safe_ports port 591
acl Safe_ports port 777
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 3128 transparent
hierarchy_stoplist cgi-bin ?
cache_mem 6 MB
maximum_object_size_in_memory 32 KB
memory_replacement_policy heap GDSF
cache_replacement_policy heap LFUDA
cache_dir aufs /cache 20000 16 256
maximum_object_size 64 MB
cache_swap_low 98
cache_swap_high 99
access_log /var/log/squid/access.log squid
cache_log /var/log/squid/cache.log
cache_store_log none
log_fqdn off
#storeurl_rewrite_program /etc/squid/store_url_rewrite
#acl store_rewrite_list url_regex ^http://(.*?)/get_video\?
#acl store_rewrite_list url_regex ^http://(.*?)/videoplayback\?
acl store_rewrite_list urlpath_regex \/(get_video\?|videodownload\?|videoplayback.*id) \.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv)\? \/ads\?
acl store_rewrite_list_web url_regex ^http:\/\/([A-Za-z-]+[0-9]+)*\.[A-Za-z]*\.[A-Za-z]*
acl store_rewrite_list_path urlpath_regex \.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv)$
acl store_rewrite_list_web_CDN url_regex ^http:\/\/[a-z]+[0-9]\.google\.com doubleclick\.net
acl QUERY2 urlpath_regex get_video\? videoplayback\? \.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv)\?
cache allow QUERY2
cache allow store_rewrite_list_web_CDN
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
storeurl_access allow store_rewrite_list
#this is not related to YouTube video; it is only for CDN pictures
storeurl_access allow store_rewrite_list_web_CDN
storeurl_access allow store_rewrite_list_web store_rewrite_list_path
storeurl_access deny all
#rewrite_program path was based on Windows, so use your own path
storeurl_rewrite_program /etc/squid/cacheyoutube2.pl
storeurl_rewrite_children 1
storeurl_rewrite_concurrency 10
refresh_pattern ^http://(.*?)/get_video\? 10080 90% 999999 override-expire ignore-no-cache ignore-private
refresh_pattern ^http://(.*?)/videoplayback\? 10080 90% 999999 override-expire ignore-no-cache ignore-private
refresh_pattern -i (get_video\?|videoplayback\?id|videoplayback.*id) 161280 50000% 525948 override-expire ignore-reload
#and for pictures
refresh_pattern -i \.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv)(\?|$) 161280 3000% 525948 override-expire reload-into-ims
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
quick_abort_min 0
quick_abort_max 0
quick_abort_pct 98
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
vary_ignore_expire on
cache_effective_user squid
cache_effective_group squid
log_icp_queries off
ipcache_size 2048
ipcache_low 98
ipcache_high 99
memory_pools off
reload_into_ims on
coredump_dir /usr/var/cache
pipeline_prefetch on
The storeurl program is as follows.
Contents of the file cacheyoutube2.pl:
#!/usr/bin/perl
$|=1;
while (<>) {
@X = split;
$x = $X[0] . " "; # channel ID plus the space that separates it from the URL
$_ = $X[1];
$u = $X[1];
if (m/^http:\/\/([0-9.]{4}|www\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?(videoplayback\?id=.*?|video_id=.*?)\&(.*?)/) {
$z = $2; $z =~ s/video_id=/get_video?video_id=/; # compatible to old cached get_video?video_id
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $z . "\n";
# new youtube
} elsif (m/^http:\/\/([0-9.]{4}|www\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?\&(id=[a-zA-Z0-9]*)/) {
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "\n";
} elsif (m/^http:\/\/www\.google-analytics\.com\/__utm\.gif\?.*/) {
print $x . "http://www.google-analytics.com/__utm.gif\n";
#cache high latency ads
} elsif (m/^http:\/\/(.*?)\/(ads)\?(.*?)/) {
print $x . "http://" . $1 . "/" . $2 . "\n";
# specific servers start here....
} elsif (m/^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*?)/) {
print $x . "http://" . $1 . "\n";
#rapidshare
} elsif ( ($u =~ /rapidshare/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)([a-z]*\.[^\/]{3}\/[a-z]*\/[0-9]*)\/(.*?)\/([^\/\?\&]{4,})$/)) {
print $x . "http://cdn." . $3 . "/SQUIDINTERNAL/" . $5 . "\n";
} elsif ( ($u =~ /maxporn/) && (m/^http:\/\/([^\/]*?)\/(.*?)\/([^\/]*?)(\?.*)?$/)) {
# $z = $1; $z =~ s/[A-Za-z]+[0-9-.]+/cdn/;
print $x . "http://" . $1 . "/SQUIDINTERNAL/" . $3 . "\n";
#like porn hub: variable host and middle part of the path, filename extension of 3 or 4 characters, with or without ? at the end
} elsif ( ($u =~ /tube8|pornhub/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)\.([a-z]*[0-9]?\.[^\/]{3}\/[a-z]*)(.*?)((\/[a-z]*)?(\/[^\/]*){4}\.[^\/\?]{3,4})(\?.*)?$/)) {
print $x . "http://cdn." . $3 . $5 . "\n";
#...specific servers end here.
#general purpose for cdn servers. add above your specific servers.
} elsif (m/^http:\/\/([0-9.]*?)\/\/(.*?)\.(.*)\?(.*?)/) {
print $x . "http://squid-cdn-url//" . $2 . "." . $3 . "\n";
#for yimg.com
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*?)\.yimg\.com\/(.*?)\?(.*?)/) {
print $x . "http://cdn.yimg.com/" . $3 . "\n";
#generic http://variable.domain.com/path/filename."ext" or "exte" with or without "?"
} elsif (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)\.(.*?)\.(.*?)\/(.*?)\.([^\/\?\&]{3,4})(\?.*)?$/) {
print $x . "http://cdn." . $3 . "." . $4 . "/" . $5 . "." . $6 . "\n";
# generic http://variable.domain.com/...
} elsif (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)\.(.*?)\.(.*?)\/(.*)$/) {
print $x . "http://cdn." . $3 . "." . $4 . "/" . $5 . "\n";
# specific extension that ends with ?
} elsif (m/^http:\/\/(.*?)\/(.*?)\.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|on2)\?(.*)/) {
print $x . "http://" . $1 . "/" . $2 . "." . $3 . "\n";
# all that ends with ;
} elsif (m/^http:\/\/(.*?)\/(.*?)\;(.*)/) {
print $x . "http://" . $1 . "/" . $2 . "\n";
} else {
print $x . $_ . "\n";
}
}
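Because storeurl_rewrite_concurrency is set, each request line that Squid sends to the helper starts with a channel ID followed by the URL, and the answer has to start with the same ID. The Python sketch below imitates only the first two YouTube branches of the script to show how changing server names collapse into one key. It is an illustration, not a replacement for the Perl helper; the function name and the sample request lines are invented here.

```python
import re

HOSTS = (r"([0-9.]{4}|www\.youtube\.com|.*\.googlevideo\.com"
         r"|.*\.video\.google\.com)")
# First branch: videoplayback?id=... or video_id=... followed by "&".
OLD_STYLE = re.compile(
    r"^http://" + HOSTS + r".*?(videoplayback\?id=.*?|video_id=.*?)&(.*?)")
# Second branch ("new youtube"): id=... somewhere after an "&".
NEW_STYLE = re.compile(r"^http://" + HOSTS + r".*?&(id=[a-zA-Z0-9]*)")
KEY_PREFIX = "http://video-srv.youtube.com.SQUIDINTERNAL/"


def rewrite_line(line):
    """Turn one 'ID URL ...' request line into an 'ID store-URL' answer."""
    fields = line.split()
    channel, url = fields[0], fields[1]
    m = OLD_STYLE.match(url)
    if m:
        # Keep keys compatible with objects cached under get_video?video_id=
        key = m.group(2).replace("video_id=", "get_video?video_id=", 1)
        return channel + " " + KEY_PREFIX + key
    m = NEW_STYLE.match(url)
    if m:
        return channel + " " + KEY_PREFIX + m.group(2)
    return channel + " " + url  # anything else keeps its own URL


# Invented request lines; only the host names differ in the first two.
for request in [
    "0 http://v1.cache.googlevideo.com/videoplayback?id=abc123&itag=5",
    "1 http://v7.cache.googlevideo.com/videoplayback?id=abc123&itag=5",
    "2 http://www.youtube.com/get_video?video_id=XYZ&t=tok",
    "3 http://v2.cache.googlevideo.com/videoplayback?ip=0.0.0.0&id=abcd1234&itag=34",
]:
    print(rewrite_line(request))
```

The first two answers carry different channel IDs but the same store URL, so both requests refer to one cached object.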
Don't forget to run chmod +x on the Perl file (for example, chmod +x /etc/squid/cacheyoutube2.pl) so that it can be executed.