Software/Apache

META

META tags go between the <HEAD> and </HEAD> tags at the top of the HTML page.

Cascading Style Sheets

<LINK href="mystyle.css" rel="stylesheet" type="text/css">
<LINK href="mystyle.css" title="compact" rel="stylesheet" type="text/css">
<LINK href="mystyle.css" title="Medium" rel="alternate stylesheet" type="text/css">

Cookies

<META HTTP-EQUIV="Set-Cookie" CONTENT="cookievalue=xxx;expires=Friday, 31-Dec-99 23:59:59 GMT; path=/">

Force top window

<META HTTP-EQUIV="Window-target" CONTENT="_top">

Refresh

<META HTTP-EQUIV="Refresh" CONTENT="3;URL=http://www.some.org/some.html">

Description / Keywords

Search engines now usually weigh a combination of page elements when ranking your listing, not just your meta tags, and some ignore meta tags entirely. Two meta tags can still help your search engine listings: meta keywords and meta description.

<META NAME="description" content="This would be your description of what is on your page. Your most important
keyword phrases should appear in this description.">

The description meta should not exceed 250 characters, including spaces. Include 3-4 of your most important keyword phrases, especially those used in your title tag and page copy. Try to put your most important keywords at the beginning of the description. This often brings better results, and it helps avoid having a search engine cut off your keywords if it limits the length of the description.


<META NAME="keywords" content="keywords phrase 1, keyword phrase 2, keyword phrase 3, etc.">


The keywords meta should not exceed 1024 characters, including spaces. Make sure it accurately describes the content of your page. Use only keyword phrases that also appear in the page copy, title tag, meta description, and other tags. A keyword phrase that appears nowhere else on the page is unlikely to have enough prominence to help your listing for that phrase. If you know of a common misspelling of a popular keyword that people might use to find your site, add it to the keywords meta tag. If your site has content of interest to a specific geographic location, include the actual location in the keywords meta.
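
The two length limits above are easy to check mechanically. Here is a minimal sketch, assuming the limits quoted in these notes (250 characters for the description, 1024 for the keywords). The helper name checkMetaLengths and the sample strings are made up for illustration and are not part of any library.

```javascript
// Limits taken from the notes above; adjust them if your target engines differ.
const DESCRIPTION_MAX = 250;
const KEYWORDS_MAX = 1024;

// Hypothetical helper: reports whether each meta value fits its limit.
// Lengths are counted with String.length (UTF-16 code units), spaces included.
function checkMetaLengths(description, keywords) {
  return {
    descriptionLength: description.length,
    descriptionOk: description.length <= DESCRIPTION_MAX,
    keywordsLength: keywords.length,
    keywordsOk: keywords.length <= KEYWORDS_MAX,
  };
}

// Sample values, invented for this example.
const report = checkMetaLengths(
  "Hand-made oak furniture from Yorkshire: tables, chairs and shelving.",
  "oak furniture, hand-made tables, Yorkshire furniture maker"
);
console.log(report);
```

Running this against the real content of a page before publishing catches a description that a search engine might truncate.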

No Spam Notice

<meta name="no-email-collection" content="http://www.unspam.com/noemailcollection" />

Alternatively, replace the URL in the tag with a link to your own terms of use page.

Javascript file

 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
     "http://www.w3.org/TR/html4/strict.dtd">
 <HTML>
 <HEAD>
 <TITLE>A document Title</TITLE>
 <SCRIPT type="text/javascript" src="includes/common.js"></SCRIPT>
 </HEAD>
 <BODY>
 The text of the webpage
    
 <BUTTON type="button" name="mybutton" value="10">
 <SCRIPT type="text/javascript">
      function my_onclick() {
         . . .
      }
    document.getElementsByName("mybutton")[0].onclick = my_onclick
 </SCRIPT>
 </BUTTON>
  
 
 <SCRIPT type="text/javascript">
 <!--  to hide script contents from old browsers
  function square(i) {
    document.write("The call passed ", i ," to the function.","<BR>")
    return i * i
  }
  document.write("The function returned ",square(5),".")
 // end hiding contents from old browsers  -->
 </SCRIPT>
 
 Here's a more interesting window handler:
    
 <SCRIPT type="text/javascript">
      function my_onload() {
         . . .
      }
  
      var win = window.open("some/other/URI")
      if (win) win.onload = my_onload
 </SCRIPT>
 <NOSCRIPT>
 <P>Access the <A href="http://someplace.com/data">data.</A>
 </NOSCRIPT>
 
 </BODY>
 </HTML>

HTACCESS

Access Log file shielding

Blocking posting spam or blog spam

Clever Examples

Stop remote Image 'hotlink' linkages

Example: stop 'hotlinking' redirect code for DIGG.COM or other referrers.

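<IfModule mod_rewrite.c>
RewriteEngine On
# Blocking Digg: this must come before the catch-all rules below,
# because their [L] flag ends processing for files that exist.
RewriteCond %{HTTP_REFERER} digg\.com [NC]
RewriteRule .* - [F]
# Serve existing files and directories unchanged...
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+) - [PT,L]
# ...and send everything else to index.php.
RewriteRule ^(.*) index.php
# Body of the 403 response: bounce the visitor back to Digg.
ErrorDocument 403 "<meta http-equiv=refresh content='0; url=http://digg.com/'>"
</IfModule>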
 
OR

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://([-a-z0-9]+\.)?example\.com [NC]
RewriteRule .*\.(zip|mp3|avi|wmv|mpg|mpeg)$ nohotlink.gif [R,NC,L]
</IfModule>

OR

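RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
# Exclude the replacement image itself, otherwise the redirect loops.
RewriteCond %{REQUEST_URI} !angryman\.gif$
RewriteRule \.(gif|jpg)$ angryman.gif [R,L]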
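
To see which requests the two referer conditions actually catch, the logic can be mirrored outside Apache. This is only a sketch: isHotlink is a made-up helper, the sample referers are invented, and the pattern is the one from the rule with the dot in the domain escaped.

```javascript
// Mirrors the two RewriteCond tests: the rule fires only when the Referer
// is non-empty AND does not start with http://mydomain.com/ or
// http://www.mydomain.com/ (case-insensitive, like the [NC] flag).
const OWN_SITE = /^http:\/\/(www\.)?mydomain\.com\/.*$/i;

// Hypothetical helper: true when the request would be treated as a hotlink.
function isHotlink(referer) {
  const hasReferer = !/^$/.test(referer); // RewriteCond %{HTTP_REFERER} !^$
  const fromOwnSite = OWN_SITE.test(referer);
  return hasReferer && !fromOwnSite;
}

console.log(isHotlink(""));                                   // false
console.log(isHotlink("http://www.mydomain.com/page.html"));  // false
console.log(isHotlink("http://digg.com/story"));              // true
console.log(isHotlink("https://www.mydomain.com/page.html")); // true
```

The last case is worth noting: because the pattern only allows http://, pages on your own site served over HTTPS would be treated as hotlinkers unless the pattern is widened (for example with https? in place of http).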

Stop remote webpage linkages from 'heavy use' sites

Example code that refuses requests (403 Forbidden) from other sites that embed your images, scripts, or stylesheets.


<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule \.(gif|jpg|js|css)$ - [F]
</IfModule>

OR

<IfModule mod_rewrite.c>
RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} badsite\.com [NC,OR]
RewriteCond %{HTTP_REFERER} anotherbadsite\.com
RewriteRule .* - [F]
</IfModule>

Blocking Bad Bots - Fetchers

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo\.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]
</IfModule>
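
Most patterns in the list start with ^, so they match only when the User-Agent begins with that name, while the HTTrack and Indy Library entries match anywhere in the string and ignore case ([NC]). The sketch below mirrors a handful of the entries to show the difference. isBlocked is a made-up helper and the sample User-Agent strings are invented for illustration.

```javascript
// A few patterns copied from the list above. "^" anchors the match to the
// start of the User-Agent; the "i" flag plays the role of [NC].
const BAD_AGENTS = [
  /^Wget/,
  /^WebZIP/,
  /^Teleport Pro/,
  /HTTrack/i,
  /Indy Library/i,
];

// Hypothetical helper: true when any pattern matches, like the [OR] chain.
function isBlocked(userAgent) {
  return BAD_AGENTS.some((pattern) => pattern.test(userAgent));
}

console.log(isBlocked("Wget/1.21.4"));                                        // true
console.log(isBlocked("wget/1.21.4"));                                        // false
console.log(isBlocked("Mozilla/5.0 (compatible; Wget)"));                     // false
console.log(isBlocked("Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)")); // true
```

The two false results show how easily an anchored, case-sensitive pattern is sidestepped. User-Agent blocking only stops fetchers that identify themselves honestly.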

Remove www from calling URLs

(For those folks who prefer the shorter URLs :)

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [R=301,L]
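
What the rule does to a request can be sketched as a small function. This assumes the .htaccess file sits in the document root, so the leading slash is already stripped from the path before the pattern is applied. stripWww is a made-up helper and the sample paths are invented. Query strings are not modelled here; mod_rewrite carries them over by default.

```javascript
// Hypothetical helper mirroring the rule: if the host is www.domain.com
// (any letter case), return the URL to redirect to; otherwise return null.
function stripWww(host, path) {
  if (!/^www\.domain\.com$/i.test(host)) {
    return null; // RewriteCond does not match, the request is served as-is
  }
  // In .htaccess context the leading slash is removed before matching,
  // so $1 is the path without it.
  const captured = path.replace(/^\//, "");
  return "http://domain.com/" + captured;
}

console.log(stripWww("www.domain.com", "/notes/apache.html"));
// http://domain.com/notes/apache.html
console.log(stripWww("domain.com", "/notes/apache.html"));
// null
```

Because the condition is anchored with $, a Host header that carries a port (www.domain.com:8080) would not match and would be served without the redirect.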

ROBOTS

(GURF)