
All the News From World

Tuesday, January 22, 2008

Google Internship

Friday, January 11, 2008

Why sitemaps are important for search engine optimisation

What are Sitemaps?

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.

Sitemaps XML format

Jump to:
XML tag definitions
Entity escaping
Using Sitemap index files
Sitemap file location
Validating your Sitemap
Extending the Sitemaps protocol
Informing search engine crawlers

This document describes the XML schema for the Sitemap protocol.

The Sitemap protocol format consists of XML tags. All data values in a Sitemap must be entity-escaped. The file itself must be UTF-8 encoded.

The Sitemap must:

  • Begin with an opening <urlset> tag and end with a closing </urlset> tag.
  • Specify the namespace (protocol standard) within the <urlset> tag.
  • Include a <url> entry for each URL, as a parent XML tag.
  • Include a <loc> child entry for each <url> parent tag.

All other tags are optional. Support for these optional tags may vary among search engines. Refer to each search engine's documentation for details.

Sample XML Sitemap

The following example shows a Sitemap that contains just one URL and uses all three optional tags (lastmod, changefreq, and priority).


<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
</urlset>



Also see our example with multiple URLs.

XML tag definitions

The available XML tags are described below.

<urlset> - required

Encapsulates the file and references the current protocol standard.

<url> - required

Parent tag for each URL entry. The remaining tags are children of this tag.

<loc> - required

URL of the page. This URL must begin with the protocol (such as http) and end with a trailing slash, if your web server requires it. This value must be less than 2,048 characters.

<lastmod> - optional

The date of last modification of the file. This date should be in W3C Datetime format. This format allows you to omit the time portion, if desired, and use YYYY-MM-DD.

Note that this tag is separate from the If-Modified-Since (304) header the server can return, and search engines may use the information from both sources differently.

<changefreq> - optional

How frequently the page is likely to change. This value provides general information to search engines and may not correlate exactly to how often they crawl the page. Valid values are:

  • always
  • hourly
  • daily
  • weekly
  • monthly
  • yearly
  • never

The value "always" should be used to describe documents that change each time they are accessed. The value "never" should be used to describe archived URLs.

Please note that the value of this tag is considered a hint and not a command. Even though search engine crawlers may consider this information when making decisions, they may crawl pages marked "hourly" less frequently than that, and they may crawl pages marked "yearly" more frequently than that. Crawlers may periodically crawl pages marked "never" so that they can handle unexpected changes to those pages.

<priority> - optional

The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites; it only lets the search engines know which pages you deem most important for the crawlers.

The default priority of a page is 0.5.

Please note that the priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages. Search engines may use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages are present in a search index.

Also, please note that assigning a high priority to all of the URLs on your site is not likely to help you. Since the priority is relative, it is only used to select between URLs on your site.


Entity escaping

Your Sitemap file must be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below.

Character          Escape Code
Ampersand (&)      &amp;
Single Quote (')   &apos;
Double Quote (")   &quot;
Greater Than (>)   &gt;
Less Than (<)      &lt;

In addition, all URLs (including the URL of your Sitemap) must be URL-escaped and encoded for readability by the web server on which they are located. However, if you are using any sort of script, tool, or log file to generate your URLs (anything except typing them in by hand), this is usually already done for you. Please check to make sure that your URLs follow the RFC-3986 standard for URIs, the RFC-3987 standard for IRIs, and the XML standard.

Below is an example of a URL that uses a non-ASCII character (ü), as well as a character that requires entity escaping (&):

http://www.example.com/ümlat.php&q=name

Below is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that encoding) and URL escaped:

http://www.example.com/%FCmlat.php&q=name

Below is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding) and URL escaped:

http://www.example.com/%C3%BCmlat.php&q=name

Below is that same URL, but also entity escaped:

http://www.example.com/%C3%BCmlat.php&amp;q=name
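
If you generate Sitemap entries programmatically, both escaping steps can be automated. Below is a minimal sketch (an illustration, not part of the protocol documentation) using only Python's standard library:

from urllib.parse import quote
from xml.sax.saxutils import escape

url = "http://www.example.com/ümlat.php&q=name"

# Step 1: URL-escape using UTF-8 percent-encoding, keeping the
# characters that give the URL its structure (:/?&=) intact.
url_escaped = quote(url, safe=":/?&=")
print(url_escaped)           # http://www.example.com/%C3%BCmlat.php&q=name

# Step 2: entity-escape the result for inclusion in the XML file
# (& becomes &amp;, and < > are escaped as well).
print(escape(url_escaped))   # http://www.example.com/%C3%BCmlat.php&amp;q=name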

Sample XML Sitemap

The following example shows a Sitemap in XML format. The Sitemap in the example contains a small number of URLs, each using a different set of optional parameters.


<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=12&amp;desc=vacation_hawaii</loc>
      <changefreq>weekly</changefreq>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=73&amp;desc=vacation_new_zealand</loc>
      <lastmod>2004-12-23</lastmod>
      <changefreq>weekly</changefreq>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=74&amp;desc=vacation_newfoundland</loc>
      <lastmod>2004-12-23T18:00:15+00:00</lastmod>
      <priority>0.3</priority>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=83&amp;desc=vacation_usa</loc>
      <lastmod>2004-11-23</lastmod>
   </url>
</urlset>
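
As an illustration (not part of the original document), a Sitemap like the one above can be generated with Python's standard library, which also takes care of the entity escaping described earlier:

import xml.etree.ElementTree as ET

pages = [
    {"loc": "http://www.example.com/", "lastmod": "2005-01-01",
     "changefreq": "monthly", "priority": "0.8"},
    {"loc": "http://www.example.com/catalog?item=12&desc=vacation_hawaii",
     "changefreq": "weekly"},
]

urlset = ET.Element("urlset",
                    xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for page in pages:
    url = ET.SubElement(urlset, "url")
    for tag, value in page.items():
        ET.SubElement(url, tag).text = value   # & is escaped to &amp; on output

ET.ElementTree(urlset).write("sitemap.xml",
                             encoding="UTF-8", xml_declaration=True)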



Using Sitemap index files (to group multiple sitemap files)

You can provide multiple Sitemap files, but each Sitemap file that you provide must have no more than 50,000 URLs and must be no larger than 10MB (10,485,760 bytes). If you would like, you may compress your Sitemap files using gzip to stay within 10MB and reduce your bandwidth requirement. If you want to list more than 50,000 URLs, you must create multiple Sitemap files.

If you do provide multiple Sitemaps, you should then list each Sitemap file in a Sitemap index file. Sitemap index files may not list more than 1,000 Sitemaps and must be no larger than 10MB (10,485,760 bytes). The XML format of a Sitemap index file is very similar to the XML format of a Sitemap file.

The Sitemap index file must:

  • Begin with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.
  • Include a <sitemap> entry for each Sitemap as a parent XML tag.
  • Include a <loc> child entry for each <sitemap> parent tag.

The optional <lastmod> tag is also available for Sitemap index files.

Note: A Sitemap index file can only specify Sitemaps that are found on the same site as the Sitemap index file. For example, http://www.yoursite.com/sitemap_index.xml can include Sitemaps on http://www.yoursite.com but not on http://www.example.com or http://yourhost.yoursite.com. As with Sitemaps, your Sitemap index file must be UTF-8 encoded.

Sample XML Sitemap Index

The following example shows a Sitemap index that lists two Sitemaps:


<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml.gz</loc>
      <lastmod>2004-10-01T18:23:17+00:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap2.xml.gz</loc>
      <lastmod>2005-01-01</lastmod>
   </sitemap>
</sitemapindex>


Note: Sitemap URLs, like all values in your XML files, must be entity escaped.

Sitemap Index XML Tag Definitions

<sitemapindex> - required

Encapsulates information about all of the Sitemaps in the file.

<sitemap> - required

Encapsulates information about an individual Sitemap.

<loc> - required

Identifies the location of the Sitemap.

This location can be a Sitemap, an Atom file, RSS file or a simple text file.

<lastmod> - optional

Identifies the time that the corresponding Sitemap file was modified. It does not correspond to the time that any of the pages listed in that Sitemap were changed. The value for the lastmod tag should be in W3C Datetime format.

By providing the last modification timestamp, you enable search engine crawlers to retrieve only a subset of the Sitemaps in the index, i.e., a crawler may retrieve only Sitemaps that were modified since a certain date. This incremental Sitemap fetching mechanism allows for the rapid discovery of new URLs on very large sites.
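
As an illustration of this incremental mechanism (a sketch, not part of the protocol documentation), the following Python snippet parses an index file like the example above and selects only the Sitemaps modified after a cutoff date:

import xml.etree.ElementTree as ET
from datetime import date

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
cutoff = date(2004, 12, 1)    # fetch only Sitemaps modified after this date

tree = ET.parse("sitemap_index.xml")
for sitemap in tree.iter(NS + "sitemap"):
    lastmod = sitemap.findtext(NS + "lastmod", default="")
    # Compare on the YYYY-MM-DD prefix of the W3C Datetime value.
    if lastmod and date.fromisoformat(lastmod[:10]) > cutoff:
        print("fetch:", sitemap.findtext(NS + "loc"))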

Other Sitemap formats

The Sitemap protocol enables you to provide details about your pages to search engines, and we encourage its use since you can provide additional information about site pages beyond just the URLs. However, in addition to the XML protocol, we support RSS feeds and text files, which provide more limited information.

Syndication feed

You can provide an RSS (Really Simple Syndication) 2.0 or Atom 0.3 or 1.0 feed. Generally, you would use this format only if your site already has a syndication feed. Note that this method may not let search engines know about all the URLs in your site, since the feed may only provide information on recent URLs, although search engines can still use that information to find out about other pages on your site during their normal crawling processes by following links inside pages in the feed. Make sure that the feed is located in the highest-level directory you want search engines to crawl. Search engines extract the information from the feed as follows:

  • <link> field - indicates the URL
  • modified date field (the <pubDate> field for RSS feeds and the <updated> date for Atom feeds) - indicates when each URL was last modified. Use of the modified date field is optional.

Text file

You can provide a simple text file that contains one URL per line. The text file must follow these guidelines:

  • The text file must have one URL per line. The URLs cannot contain embedded new lines.
  • You must fully specify URLs, including the protocol (such as http).
  • Each text file can contain a maximum of 50,000 URLs. If your site includes more than 50,000 URLs, you can separate the list into multiple text files and add each one separately.
  • The text file must use UTF-8 encoding. You can specify this when you save the file (for instance, in Notepad, this is listed in the Encoding menu of the Save As dialog box).
  • The text file should contain no information other than the list of URLs.
  • The text file should contain no header or footer information.
  • You can name the text file anything you wish.
  • You should upload the text file to the highest-level directory you want search engines to crawl and make sure that you don't list URLs in the text file that are located in a higher-level directory.

Sample text file entries are shown below.

http://www.example.com/catalog?item=1
http://www.example.com/catalog?item=11

Sitemap file location

The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but cannot include URLs starting with http://example.com/images/.

If you have permission to change http://example.org/path/sitemap.xml, it is assumed that you also have permission to provide information for URLs with the prefix http://example.org/path/. Examples of URLs considered valid in http://example.com/catalog/sitemap.xml include:

http://example.com/catalog/show?item=23
http://example.com/catalog/show?item=233&user=3453

URLs not considered valid in http://example.com/catalog/sitemap.xml include:

http://example.com/image/show?item=23
http://example.com/image/show?item=233&user=3453
https://example.com/catalog/page1.php

Note that this means that all URLs listed in the Sitemap must use the same protocol (http, in this example) and reside on the same host as the Sitemap. For instance, if the Sitemap is located at http://www.example.com/sitemap.xml, it can't include URLs from http://subdomain.example.com.

URLs that are not considered valid are dropped from further consideration. It is strongly recommended that you place your Sitemap at the root directory of your web server. For example, if your web server is at example.com, then your Sitemap index file would be at http://example.com/sitemap.xml. In certain cases, you may need to produce different Sitemaps for different paths (e.g., if security permissions in your organization compartmentalize write access to different directories).

If you submit a Sitemap using a path with a port number, you must include that port number as part of the path in each URL listed in the Sitemap file. For instance, if your Sitemap is located at http://www.example.com:100/sitemap.xml, then each URL listed in the Sitemap must begin with http://www.example.com:100.
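
The location rule above boils down to a simple prefix check. Here is a small sketch (an illustration; is_valid_for is a hypothetical helper):

def is_valid_for(sitemap_url, page_url):
    # A URL may appear in a Sitemap only if it starts with the Sitemap's
    # own location prefix (same protocol, host, port and path).
    prefix = sitemap_url.rsplit("/", 1)[0] + "/"
    return page_url.startswith(prefix)

print(is_valid_for("http://example.com/catalog/sitemap.xml",
                   "http://example.com/catalog/show?item=23"))   # True
print(is_valid_for("http://example.com/catalog/sitemap.xml",
                   "https://example.com/catalog/page1.php"))     # False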


Validating your Sitemap

The following XML schemas define the elements and attributes that can appear in your Sitemap file. You can download this schema from the links below:

For Sitemaps: http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
For Sitemap index files: http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd

There are a number of tools available to help you validate the structure of your Sitemap based on this schema. You can find a list of XML-related tools at each of the following locations:

http://www.w3.org/XML/Schema#Tools
http://www.xml.com/pub/a/2000/12/13/schematools.html

In order to validate your Sitemap or Sitemap index file against a schema, the XML file will need additional headers as shown below.

Sitemap:

<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
        http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      ...
   </url>
</urlset>

Sitemap index file:

<?xml version='1.0' encoding='UTF-8'?>
<sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
        http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      ...
   </sitemap>
</sitemapindex>
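
You can also validate programmatically. The sketch below assumes the third-party lxml package and a locally downloaded copy of the schema file (sitemap.xsd):

from lxml import etree   # third-party: pip install lxml

schema = etree.XMLSchema(etree.parse("sitemap.xsd"))
doc = etree.parse("sitemap.xml")

print(schema.validate(doc))   # True if the Sitemap conforms to the schema
for error in schema.error_log:
    print(error)              # details for any violations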



Extending the Sitemaps protocol

You can extend the Sitemaps protocol using your own namespace. Simply specify this namespace in the root element. For example:







<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:example="http://www.example.com/schemas/example_schema">
   <url>
      <loc>...</loc>
      <example:example_tag>...</example:example_tag>
   </url>
   ...
</urlset>


Informing search engine crawlers

Once you have created the Sitemap file and placed it on your webserver, you need to inform the search engines that support this protocol of its location. You can do this by:

  • submitting your Sitemap via the search engine's submission interface
  • specifying the Sitemap location in your robots.txt file
  • sending an HTTP request to the search engine

The search engines can then retrieve your Sitemap and make the URLs available to their crawlers.

Submitting your Sitemap via the search engine's submission interface

To submit your Sitemap directly to a search engine, which will enable you to receive status information and any processing errors, refer to each search engine's documentation.

Specifying the Sitemap location in your robots.txt file

You can specify the location of the Sitemap using a robots.txt file. To do this, simply add the following line:

Sitemap: <sitemap_location>

The <sitemap_location> should be the complete URL to the Sitemap, such as: http://www.example.com/sitemap.xml

This directive is independent of the user-agent line, so it doesn't matter where you place it in your file. If you have a Sitemap index file, you can include the location of just that file. You don't need to list each individual Sitemap listed in the index file.

Submitting your Sitemap via an HTTP request

To submit your Sitemap using an HTTP request (replace <searchengine_URL> with the URL provided by the search engine), issue your request to the following URL:

<searchengine_URL>/ping?sitemap=sitemap_url

For example, if your Sitemap is located at http://www.example.com/sitemap.gz, your URL will become:

<searchengine_URL>/ping?sitemap=http://www.example.com/sitemap.gz

URL encode everything after the /ping?sitemap=:

<searchengine_URL>/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap.gz

You can issue the HTTP request using wget, curl, or another mechanism of your choosing. A successful request will return an HTTP 200 response code; if you receive a different response, you should resubmit your request. The HTTP 200 response code only indicates that the search engine has received your Sitemap, not that the Sitemap itself or the URLs contained in it were valid. Since Sitemaps change over time, an easy way to keep the search engines up to date is to set up an automated job that generates and submits them on a regular basis.
Note: If you are providing a Sitemap index file, you only need to issue one HTTP request that includes the location of the Sitemap index file; you do not need to issue individual requests for each Sitemap listed in the index.
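
As an illustration, such an automated job could submit the Sitemap with a few lines of Python (the ping endpoint below is a placeholder; use the URL documented by the search engine):

from urllib.parse import quote
from urllib.request import urlopen

SEARCH_ENGINE_URL = "http://searchengine.example.com/ping"   # placeholder
sitemap_url = "http://www.example.com/sitemap.gz"

# URL-encode everything after /ping?sitemap=
request_url = SEARCH_ENGINE_URL + "?sitemap=" + quote(sitemap_url, safe="")

with urlopen(request_url) as response:
    # 200 means only that the request was received,
    # not that the Sitemap or its URLs are valid.
    print(response.status)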





Thursday, January 10, 2008

Self Improvement and Confidence

1. Be confident. If you don't believe in yourself, why would anyone else? We all have something we are good at. Have faith in yourself and try to work on your niche skills. Drive the fear of failure far away from you. Remember, what doesn't kill you only makes you stronger!

2. Be clear. Fuzzy, undefined goals are difficult to focus on. How will you proceed when the path seems all foggy? Ask your manager for defined, measurable objectives and tasks.

If your manager is not very forthcoming, take some initiative and work with him until you have clarity about your role and what you will be appraised on at the end of the year. Self-motivated people work best with clearly defined objectives in life. Even if the targets seem a little hazy, home-grown motivation can come in really handy!

3. Work on yourself. Nothing works better as a power shot of motivation than the knowledge that you are good at what you do! Be on top of things at work. Identify your weak areas and get them out of the way. Enroll in courses that will raise your market value and also your motivation levels. Getting a few certifications and qualifications in your functional skills will definitely instill a great deal of confidence.

4. Take criticism positively. Even when the other person has no constructive intentions, turn all negative criticism into a positive driving force. Failure is a state of mind. If you think you can succeed, you will. Always think positive. That way, instead of brooding over past disappointments, you will channel your frustration into the positive energy required for working harder. It works like magic.

"My boss was always running me down. Even when I did a good job, he never praised me. Initially I used to feel terrible and slowly started to look for excuses to avoid official meetings when he would once again find reasons to discourage me. I contemplated resigning and finding a new job where I did not have to prove myself over and over again," recounts MS. "But then, there were no guarantees that my next boss would be better to work with. So, I took it up as a challenge. I started reporting to work early and always finished my tasks before time. My team members started to respect me more because I helped them when my work was done. Gradually, my boss took a back step when he realized that I was now a highly productive member of the team. If I had not been self motivated to prove a point, thoughts of having failed and run away would have chased me forever."

5. Look out for challenges. If your current job leaves you unmotivated, don't worry. Be open to trying new things if your present role has become too boring to continue even a day longer. Talk to your seniors about redefining your role to optimize your capabilities. Establish your reputation as somebody who is not scared to take on new challenges in life.

6. Be persistent. Most things may not work out right the first time. This just means that you need to try harder. However, ensure that you set your heart on goals that are really important to you and will help you progress in life. Save your efforts for things that matter. Do not waste your energies on peripheral things.

7. Keep the company of successful people. Try to surround yourself with confident people who are driven and high on life. Read books that fill you with optimism. Put up motivating posters and quotes on your workstation that will spread positive energy and drive away any depressing thoughts. Look around for successful people and try to emulate them. Find out what makes them tick and include that in your working style.

8. Celebrate life. If something doesn't shape up like you thought it would, it does not mean everything else is doomed as well. Do not feel stressed; high stress leads to low motivation. Take active interest in things happening around you. Live your life well. Continue to have faith in yourself and get involved in things that give you happiness. That itself will generate enough motivation for you to glide over waves of setbacks.

9. Start today. List all that is important for you to achieve your goals. Divide long-term goals into smaller milestones and celebrate each accomplished one. Procrastination is a killer, so keep it at bay.

10. Keep dreaming. Lastly, do not forget to keep dreaming. Dream big! Let your dreams fuel your desire to get closer to your goals. Write your dreams for yourself in a diary or a journal and constantly refer to them so that you do not forget or lose sight of the objective.


--
If there is a way, I will find one; and if there is no way, I will make one.

Wednesday, January 9, 2008

Difference Between Primary Key And Unique Key in RDBMS

1) A unique key can contain NULL values, but a primary key cannot (see the sketch after this list).

2) A primary key can be referenced by another table as a foreign key.

3) A table can have multiple unique keys, but only one primary key.

4) A primary key is itself a unique key.
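
The NULL behavior in point 1 is easy to demonstrate. Here is a sketch using Python's built-in sqlite3 (note that SQLite, like MySQL and Oracle, allows multiple NULLs in a unique column, while SQL Server does not):

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,   -- only one primary key per table
        email TEXT UNIQUE            -- one of possibly many unique keys
    )
""")

# A unique column accepts NULLs (here, even more than one)...
con.execute("INSERT INTO users (id, email) VALUES (1, NULL)")
con.execute("INSERT INTO users (id, email) VALUES (2, NULL)")

# ...but duplicate non-NULL values are rejected.
con.execute("INSERT INTO users (id, email) VALUES (3, 'a@example.com')")
try:
    con.execute("INSERT INTO users (id, email) VALUES (4, 'a@example.com')")
except sqlite3.IntegrityError as exc:
    print("unique key violation:", exc)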

What is the difference between a primary key and a surrogate key?

A primary key is a special constraint on a column or set of columns.
A primary key constraint ensures that the column(s) so designated have
no NULL values, and that every value is unique. Physically, a primary
key is implemented by the database system using a unique index, and
all the columns in the primary key must have been declared NOT NULL. A
table may have only one primary key, but it may be composite (consisting
of more than one column).

A surrogate key is any column or set of columns that can be declared
as the primary key instead of a "real" or natural key. Sometimes there
can be several natural keys that could be declared as the primary key,
and these are all called candidate keys. So a surrogate is a candidate
key. A table could actually have more than one surrogate key, although
this would be unusual. The most common type of surrogate key is an
incrementing integer, such as an auto_increment column in MySQL, or a
sequence in Oracle, or an identity column in SQL Server.

Monday, January 7, 2008

New Rules to be adopted by the ICC after India's defeat in the Sydney Test

After watching the Test match, someone has written some rules that should be incorporated by the ICC to give the other teams perfect clarity:

(1) Ricky Ponting – (THE TRULY GENUINE CRICKETER OF THE CRICKET ERA, WHOSE INTEGRITY SHOULD NOT BE DOUBTED) should be considered the FOURTH UMPIRE. As per the new rules, the FOURTH UMPIRE's decision is final and will override any decisions taken by any other umpires. ON-FIELD umpires can seek the assistance of RICKY PONTING even if he is not on the field. This rule is to be made so that every team understands the importance of the FOURTH UMPIRE.

(2) While the AUSTRALIAN TEAM is bowling, if the ball flies anywhere close to an AUSTRALIAN FIELDER (WITHIN a 5 metre distance), the batsman is to be considered OUT, irrespective of whether the catch was taken cleanly or grassed. Any further clarification should be sought from the FOURTH UMPIRE. This is made to ensure that cricket is played with SPORTING SPIRIT by all the teams.

(3) While BATTING, AUSTRALIAN players will wait for the ON-FIELD UMPIRE decisions only (even if the catch goes to the FIFTH SLIP as the ball might not have touched the bat). Each AUSTRALIAN batsman has to be out FOUR TIMES (minimum) before he can return to the pavilion. In case of THE CRICKETER WITH INTEGRITY, this can be higher.

(4) UMPIRES should receive a huge bonus if an AUSTRALIAN player scores a century. Any wrong decisions can be ignored, as the umpires will be paid huge bonuses and will receive the backing of the AUSTRALIAN team and board.

(5) All AUSTRALIAN players are eligible to keep commenting about all players on the field and the OPPONENT TEAM should never comment as they will be spoiling the spirit of the AUSTRALIAN team. Any comments made in any other language are to be considered as RACISM only.

(6) MATCH REFEREE decisions will be taken purely on the AUSTRALIAN TEAM's advice. Views from players of other teams will not be considered for hearing. MATCH REFEREES are to be given a huge bonus if this rule is implemented.

(7) NO VISITING TEAM should plan to win in AUSTRALIA. This is to ensure that the sportive spirit of CRICKET is maintained.

(8) THE MOST IMPORTANT RULE: If any bowler dismisses RICKY PONTING - “THE UNDISPUTED CRICKETER WITH INTEGRITY IN THE GAME OF CRICKET” - more than twice in a series, he will be banned for the REST OF THE SERIES. This is to ensure that the best batsman/captain can keep playing to break records and create history in the game of CRICKET.

These rules will provide better clarity to all the teams VISITING AUSTRALIA.

Friday, January 4, 2008

Smart tips to invest in the stock market

Here we discuss the share market... All my scrips doubled in
10 days*, so if you want more scrip names, join this group today and
get the best returns. Here are some scrip names:
1. Alka India
2. Harig Crankshaft
3. Pentasoft Tech
4. Facor Steel
5. Pennar Alum

Check them once and see how life changes.

Small Cap Companies:


Name             Suggested Price   Target Price   One Month Return (%)
Alka India            1.11                9                110
Harig Crank           5.21               26                120
Pentasoft Tech        2.82               22                109
Silicon Vally         2.65               12                107
Facor Alloys          11                 56                 98
Facor Steel           9                  45                101
Pennar Alum           6.4                21                 99



Large Cap Companies:

Name               Suggested Price   Target Price
Gmr Infra               221               365
Reliance Natural        149               290
Tata Chemical           401               525
Mosebear                302               380
Vijaya Bank              70               120
GTL Infra                82               210


Name               Suggested Price   Target Price   Price on 31-12-2007
Reliance Energy        1365              2200              2229
Essar Oil                53               190               324
Silverline               18                95               132
Pentamedia Gra            3                22                12
Reliance Comm           510               740               741
Noida Toll               42                78                79

Creating a Backup for Your Google Account

Using a single account for all the Google services has a lot of advantages, but if, for some reason, you can't access the account or Google temporarily disables it, you lose a lot of important data. Fortunately, you can set up a backup Google account that should give you access to some of the information from your main account. (You should also back up important data in other ways: download Gmail messages using POP3/IMAP in a mail client, export your documents from Google Docs, back up your Blogger blogs, etc.)

* If you use Gmail, you could create a Gmail account whose only purpose is to fetch messages from your main account. Set up mail fetcher in the backup account and add the main account as a custom From address. This way, you'll be able to read all the messages from your account and even send mail.

* Add the backup account as a Google Talk friend from Gmail Chat or from another Google Talk interface. As a side effect, you'll have access to your shared items from Google Reader.

* For Blogger, add the backup account in the blog authors section: Settings > Permissions > Add authors. The account should have admin privileges so that you can create, edit and delete posts.

* In Google Analytics, go to Access Manager and add the account as an admin. You'll have access to all reports and profiles in the backup account.

* Google Calendar lets you share the main calendar with other people and even give them the right to edit events. Click on "Manage calendars" at the bottom of the window, share the main calendar and add the backup account. You should select "make changes and manage sharing" from the drop-down.

* If you're the owner of a group in Google Groups, go to the member invitation section, select "Add members directly" and add the backup account. Then change the membership type of the new account to "owner". It's also a good idea to select "no email" in the subscription type.

* Add the backup account as a collaborator for some of the most important Google documents and notebooks.

* Other Google services only allow you to export your data: Google Reader (Settings > Import/Export), iGoogle (share each tab with the backup account), Gmail contacts, Google News personalization (scroll to the bottom of the homepage and click on "Share your personalized news with a friend").

The backup account will not have all the data from your main account, but you'll still be able to read your email, send messages, post blog posts, check your calendar, add new events, access important documents etc.

Easy Way to Find Recent Web Pages with Google

Now that Google indexes pages extremely fast and saves the date of the first indexing, it would be nice to have more options for restricting search results to a date range. Until recently, Google provided only three options in the advanced search (see all the pages last updated in the past 3, 6, or 12 months) plus a difficult-to-use operator (daterange:).

The advanced search page has been updated and it shows four more options: find the web pages first indexed in the past day, week, month or in the past 2 months.


If you remove all the uninteresting parameters from the search URL, you'll find that as_qdr is responsible for date restrictions. For example, here's how to restrict a search for [China] to pages first seen by Google's crawler in the past 24 hours:

http://www.google.com/search?q=china&as_qdr=d

Note that you'll only find new web pages and not pages that were updated in the past 24 hours. That means you won't find homepages from popular sites or other frequently-updated pages. If the date range is small, you'll mostly find news and blog posts.

The nice thing is that you can change the value of as_qdr to custom intervals. Here are all the possible values of the as_qdr parameter:

d[number] - past number of days (e.g.: d10)
w[number] - past number of weeks
y[number] - past number of years

For example, http://www.google.com/search?q=china&as_qdr=d10 lets you search for pages that contain "China" and were created in the past 10 days.
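
If you build these URLs from code, the parameter is easy to add. A tiny sketch (illustrative only; recent_search_url is a hypothetical helper):

from urllib.parse import urlencode

def recent_search_url(query, days):
    # as_qdr=d<number> restricts results to pages first indexed
    # in the past <number> days.
    return "http://www.google.com/search?" + urlencode(
        {"q": query, "as_qdr": "d%d" % days})

print(recent_search_url("china", 10))
# http://www.google.com/search?q=china&as_qdr=d10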


A finer control (hours) and an option to sort the results by date would make this feature almost perfect.

Thursday, January 3, 2008

Best Practices for Speeding Up Your Web Site

High Performance Web Sites: The Importance of Front-End Performance

In 2004, I started the Exceptional Performance group at Yahoo!. We're a small team chartered to measure and improve the performance of Yahoo!'s products. Having worked as a back-end engineer most of my career, I approached this as I would a code optimization project - I profiled web performance to identify where there was the greatest opportunity for improvement. Since our goal is to improve the end-user experience, I measured response times in a browser over various bandwidth speeds. What I saw is illustrated in the following chart showing HTTP traffic for http://www.yahoo.com.

In that chart, the first bar, labeled "html", is the initial request for the HTML document. In this case, only 5% of the end-user response time is spent fetching the HTML document. This result holds true for almost all web sites. In sampling the top ten U.S. websites, all but one spend less than 20% of the total response time getting the HTML document. The other 80+% of the time is spent dealing with what's in the HTML document, namely, the front-end. That's why the key to faster web sites is to focus on improving front-end performance.

There are three main reasons why front-end performance is the place to start.

  1. There is more potential for improvement by focusing on the front-end. Cutting front-end time in half reduces total response times by 40% or more, whereas cutting back-end time in half results in less than a 10% reduction.
  2. Front-end improvements typically require less time and resources than back-end projects (redesigning application architecture and code, finding and optimizing critical code paths, adding or modifying hardware, distributing databases, etc.).
  3. Front-end performance tuning has been proven to work. Over fifty teams at Yahoo! have reduced their end-user response times by following our performance best practices, often by 25% or more.

Our performance golden rule is: optimize front-end performance first; that's where 80% or more of the end-user response time is spent.


1: Minimize HTTP Requests

80% of the end-user response time is spent on the front-end. Most of this time is tied up in downloading all the components in the page: images, stylesheets, scripts, Flash, etc. Reducing the number of components in turn reduces the number of HTTP requests required to render the page. This is the key to faster pages.

One way to reduce the number of components in the page is to simplify the page's design. But is there a way to build pages with richer content while also achieving fast response times? Here are some techniques for reducing the number of HTTP requests, while still supporting rich page designs.

Image maps combine multiple images into a single image. The overall size is about the same, but reducing the number of HTTP requests speeds up the page. Image maps only work if the images are contiguous in the page, such as a navigation bar. Defining the coordinates of image maps can be tedious and error prone.

CSS Sprites are the preferred method for reducing the number of image requests. Combine all the images in your page into a single image and use the CSS background-image and background-position properties to display the desired image segment.

Inline images use the data: URL scheme to embed the image data in the actual page. This can increase the size of your HTML document. Combining inline images into your (cached) stylesheets is a way to reduce HTTP requests and avoid increasing the size of your pages.
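As an illustration of the data: URL scheme (a sketch, not from the original article; icon.png is a stand-in file name), here is how an image can be converted into an inline data: URI:

import base64

with open("icon.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

data_uri = "data:image/png;base64," + encoded
# Embed it in a (cached) stylesheet, for example:
#   background-image: url("data:image/png;base64,...");
print(data_uri[:60])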

Combined files are a way to reduce the number of HTTP requests by combining all scripts into a single script, and similarly combining all stylesheets into a single stylesheet. It's a simple idea that hasn't seen wide adoption. The ten top U.S. web sites average 7 scripts and 2 stylesheets per page. Combining files is more challenging when the scripts and stylesheets vary from page to page, but making this part of your release process improves response times.

Reducing the number of HTTP requests in your page is the place to start. This is the most important guideline for improving performance for first time visitors. As described in Tenni Theurer's blog Browser Cache Usage - Exposed!, 40-60% of daily visitors to your site come in with an empty cache. Making your page fast for these first time visitors is key to a better user experience.


2: Use a Content Delivery Network

The user's proximity to your web server has an impact on response times. Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user's perspective. But where should you start?

As a first step to implementing geographically dispersed content, don't attempt to redesign your web application to work in a distributed architecture. Depending on the application, changing the architecture could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this application architecture step.

Remember that 80-90% of the end-user response time is spent downloading all the components in the page: images, stylesheets, scripts, Flash, etc. This is the Performance Golden Rule, as explained in The Importance of Front-End Performance. Rather than starting with the difficult task of redesigning your application architecture, it's better to first disperse your static content. This not only achieves a bigger reduction in response times, but it's easier thanks to content delivery networks.

A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity. For example, the server with the fewest network hops or the server with the quickest response time is chosen.

Some large Internet companies own their own CDN, but it's cost-effective to use a CDN service provider, such as Akamai Technologies, Mirror Image Internet, or Limelight Networks. For start-up companies and private web sites, the cost of a CDN service can be prohibitive, but as your target audience grows larger and becomes more global, a CDN is necessary to achieve fast response times. At Yahoo!, properties that moved static content off their application web servers to a CDN improved end-user response times by 20% or more. Switching to a CDN is a relatively easy code change that will dramatically improve the speed of your web site.


3: Add an Expires Header

Web page designs are getting richer and richer, which means more scripts, stylesheets, images, and Flash in the page. A first-time visitor to your page may have to make several HTTP requests, but by using the Expires header you make those components cacheable. This avoids unnecessary HTTP requests on subsequent page views. Expires headers are most often used with images, but they should be used on all components including scripts, stylesheets, and Flash components.

Browsers (and proxies) use a cache to reduce the number and size of HTTP requests, making web pages load faster. A web server uses the Expires header in the HTTP response to tell the client how long a component can be cached. The following is a far future Expires header, telling the browser that this response won't be stale until April 15, 2010.

      Expires: Thu, 15 Apr 2010 20:00:00 GMT

If your server is Apache, use the ExpiresDefault directive to set an expiration date relative to the current date. This example of the ExpiresDefault directive sets the Expires date 10 years out from the time of the request.

      ExpiresDefault "access plus 10 years"

Keep in mind, if you use a far future Expires header you have to change the component's filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component's filename, for example, yahoo_2.0.6.js.

Using a far future Expires header affects page views only after a user has already visited your site. It has no effect on the number of HTTP requests when a user visits your site for the first time and the browser's cache is empty. The impact of this performance improvement depends, therefore, on how often users hit your pages with a primed cache. (A "primed cache" already contains all of the components in the page.) We measured this at Yahoo! and found the number of page views with a primed cache is 75-85%. By using a far future Expires header, you increase the number of components that are cached by the browser and re-used on subsequent page views without sending a single byte over the user's Internet connection.
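
For illustration, a far future Expires header can be emitted from any server-side code, not just Apache. A sketch using Python's built-in http.server (assumptions: port 8000 and a ten-year lifetime):

import time
from email.utils import formatdate
from http.server import HTTPServer, SimpleHTTPRequestHandler

TEN_YEARS = 10 * 365 * 24 * 60 * 60

class FarFutureHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Expires uses the RFC 1123 date format shown in the example above.
        self.send_header("Expires",
                         formatdate(time.time() + TEN_YEARS, usegmt=True))
        super().end_headers()

HTTPServer(("", 8000), FarFutureHandler).serve_forever()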


4: Gzip Components

The time it takes to transfer an HTTP request and response across the network can be significantly reduced by decisions made by front-end engineers. It's true that the end-user's bandwidth speed, Internet service provider, proximity to peering exchange points, etc. are beyond the control of the development team. But there are other variables that affect response times. Compression reduces response times by reducing the size of the HTTP response.

Starting with HTTP/1.1, web clients indicate support for compression with the Accept-Encoding header in the HTTP request.

      Accept-Encoding: gzip, deflate

If the web server sees this header in the request, it may compress the response using one of the methods listed by the client. The web server notifies the web client of this via the Content-Encoding header in the response.

      Content-Encoding: gzip

Gzip is the most popular and effective compression method at this time. It was developed by the GNU project and standardized by RFC 1952. The only other compression format you're likely to see is deflate, but it's less effective and less popular.

Gzipping generally reduces the response size by about 70%. Approximately 90% of today's Internet traffic travels through browsers that claim to support gzip. If you use Apache, the module configuring gzip depends on your version: Apache 1.3 uses mod_gzip while Apache 2.x uses mod_deflate.

There are known issues with browsers and proxies that may cause a mismatch in what the browser expects and what it receives with regard to compressed content. Fortunately, these edge cases are dwindling as the use of older browsers drops off. The Apache modules help out by adding appropriate Vary response headers automatically.

Servers choose what to gzip based on file type, but are typically too limited in what they decide to compress. Most web sites gzip their HTML documents. It's also worthwhile to gzip your scripts and stylesheets, but many web sites miss this opportunity. In fact, it's worthwhile to compress any text response including XML and JSON. Image and PDF files should not be gzipped because they are already compressed. Trying to gzip them not only wastes CPU but can potentially increase file sizes.

Gzipping as many file types as possible is an easy way to reduce page weight and accelerate the user experience.
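
You can estimate the savings for your own files with a few lines of Python (an illustrative sketch; index.html is a stand-in file name):

import gzip

with open("index.html", "rb") as f:
    original = f.read()

compressed = gzip.compress(original)
saved = 100 * (1 - len(compressed) / len(original))
print("%d -> %d bytes (%.0f%% smaller)"
      % (len(original), len(compressed), saved))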


5: Put Stylesheets at the Top

While researching performance at Yahoo!, we discovered that moving stylesheets to the document HEAD makes pages load faster. This is because putting stylesheets in the HEAD allows the page to render progressively.

Front-end engineers that care about performance want a page to load progressively; that is, we want the browser to display whatever content it has as soon as possible. This is especially important for pages with a lot of content and for users on slower Internet connections. The importance of giving users visual feedback, such as progress indicators, has been well researched and documented. In our case the HTML page is the progress indicator! When the browser loads the page progressively the header, the navigation bar, the logo at the top, etc. all serve as visual feedback for the user who is waiting for the page. This improves the overall user experience.

The problem with putting stylesheets near the bottom of the document is that it prohibits progressive rendering in many browsers, including Internet Explorer. Browsers block rendering to avoid having to redraw elements of the page if their styles change. The user is stuck viewing a blank white page. Firefox doesn't block rendering, which means when the stylesheet is done loading it's possible elements in the page will have to be redrawn, resulting in the flash of unstyled content problem.

The HTML specification clearly states that stylesheets are to be included in the HEAD of the page: "Unlike A, [LINK] may only appear in the HEAD section of a document, although it may appear any number of times." Neither of the alternatives, the blank white screen or flash of unstyled content, are worth the risk. The optimal solution is to follow the HTML specification and load your stylesheets in the document HEAD.


6: Put Scripts at the Bottom

Rule 5 described how stylesheets near the bottom of the page prohibit progressive rendering, and how moving them to the document HEAD eliminates the problem. Scripts (external JavaScript files) pose a similar problem, but the solution is just the opposite: it's better to move scripts from the top to as low in the page as possible. One reason is to enable progressive rendering, but another is to achieve greater download parallelization.

With stylesheets, progressive rendering is blocked until all stylesheets have been downloaded. That's why it's best to move stylesheets to the document HEAD, so they get downloaded first and rendering isn't blocked. With scripts, progressive rendering is blocked for all content below the script. Moving scripts as low in the page as possible means there's more content above the script that is rendered sooner.

The second problem caused by scripts is blocking parallel downloads. The HTTP/1.1 specification suggests that browsers download no more than two components in parallel per hostname. If you serve your images from multiple hostnames, you can get more than two downloads to occur in parallel. (I've gotten Internet Explorer to download over 100 images in parallel.) While a script is downloading, however, the browser won't start any other downloads, even on different hostnames.

In some situations it's not easy to move scripts to the bottom. If, for example, the script uses document.write to insert part of the page's content, it can't be moved lower in the page. There might also be scoping issues. In many cases, there are ways to work around these situations.

An alternative suggestion that often comes up is to use deferred scripts. The DEFER attribute indicates that the script does not contain document.write, and is a clue to browsers that they can continue rendering. Unfortunately, Firefox doesn't support the DEFER attribute. In Internet Explorer, the script may be deferred, but not as much as desired. If a script can be deferred, it can also be moved to the bottom of the page. That will make your web pages load faster.


7: Avoid CSS Expressions

CSS expressions are a powerful (and dangerous) way to set CSS properties dynamically. They're supported in Internet Explorer, starting with version 5. As an example, the background color could be set to alternate every hour using CSS expressions.

      background-color: expression( (new Date()).getHours()%2 ? "#B8D4FF" : "#F08A00" );

As shown here, the expression method accepts a JavaScript expression. The CSS property is set to the result of evaluating the JavaScript expression. The expression method is ignored by other browsers, so it is useful for setting properties in Internet Explorer needed to create a consistent experience across browsers.

The problem with expressions is that they are evaluated more frequently than most people expect. Not only are they evaluated when the page is rendered and resized, but also when the page is scrolled and even when the user moves the mouse over the page. Adding a counter to the CSS expression allows us to keep track of when and how often a CSS expression is evaluated. Moving the mouse around the page can easily generate more than 10,000 evaluations.

One way to reduce the number of times your CSS expression is evaluated is to use one-time expressions, where the first time the expression is evaluated it sets the style property to an explicit value, which replaces the CSS expression. If the style property must be set dynamically throughout the life of the page, using event handlers instead of CSS expressions is an alternative approach. If you must use CSS expressions, remember that they may be evaluated thousands of times and could affect the performance of your page.


8: Make JavaScript and CSS External

Many of these performance rules deal with how external components are managed. However, before these considerations arise you should ask a more basic question: Should JavaScript and CSS be contained in external files, or inlined in the page itself?

Using external files in the real world generally produces faster pages because the JavaScript and CSS files are cached by the browser. JavaScript and CSS that are inlined in HTML documents get downloaded every time the HTML document is requested. This reduces the number of HTTP requests that are needed, but increases the size of the HTML document. On the other hand, if the JavaScript and CSS are in external files cached by the browser, the size of the HTML document is reduced without increasing the number of HTTP requests.

The key factor, then, is the frequency with which external JavaScript and CSS components are cached relative to the number of HTML documents requested. This factor, although difficult to quantify, can be gauged using various metrics. If users on your site have multiple page views per session and many of your pages re-use the same scripts and stylesheets, there is a greater potential benefit from cached external files.

Many web sites fall in the middle of these metrics. For these properties, the best solution generally is to deploy the JavaScript and CSS as external files. The only exception I've seen where inlining is preferable is with home pages, such as Yahoo!'s front page (http://www.yahoo.com) and My Yahoo! (http://my.yahoo.com). Home pages that have few (perhaps only one) page view per session may find that inlining JavaScript and CSS results in faster end-user response times.

For front pages that are typically the first of many page views, there are techniques that leverage the reduction of HTTP requests that inlining provides, as well as the caching benefits achieved through using external files. One such technique is to inline JavaScript and CSS in the front page, but dynamically download the external files after the page has finished loading. Subsequent pages would reference the external files that should already be in the browser's cache.


9: Reduce DNS Lookups

The Domain Name System (DNS) maps hostnames to IP addresses, just as phonebooks map people's names to their phone numbers. When you type www.yahoo.com into your browser, a DNS resolver contacted by the browser returns that server's IP address. DNS has a cost: it typically takes 20-120 milliseconds to look up the IP address for a given hostname. The browser can't download anything from a hostname until the DNS lookup is completed.

DNS lookups are cached for better performance. This caching can occur on a special caching server, maintained by the user's ISP or local area network, but there is also caching that occurs on the individual user's computer. The DNS information remains in the operating system's DNS cache (the "DNS Client service" on Microsoft Windows). Most browsers have their own caches, separate from the operating system's cache. As long as the browser keeps a DNS record in its own cache, it doesn't bother the operating system with a request for the record.

Internet Explorer caches DNS lookups for 30 minutes by default, as specified by the DnsCacheTimeout registry setting. Firefox caches DNS lookups for 1 minute, controlled by the network.dnsCacheExpiration configuration setting. (Fasterfox changes this to 1 hour.)

When the client's DNS cache is empty (for both the browser and the operating system), the number of DNS lookups is equal to the number of unique hostnames in the web page. This includes the hostnames used in the page's URL, images, script files, stylesheets, Flash objects, etc. Reducing the number of unique hostnames reduces the number of DNS lookups.

Reducing the number of unique hostnames has the potential to reduce the amount of parallel downloading that takes place in the page. Avoiding DNS lookups cuts response times, but reducing parallel downloads may increase response times. My guideline is to split these components across at least two but no more than four hostnames. This results in a good compromise between reducing DNS lookups and allowing a high degree of parallel downloads.
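
You can get a feel for DNS lookup cost with a quick measurement (an illustrative sketch; a second run is usually much faster because the lookup is served from the OS cache):

import socket
import time

for host in ["www.yahoo.com", "us.yimg.com"]:
    start = time.perf_counter()
    ip = socket.gethostbyname(host)     # blocking DNS lookup
    elapsed_ms = (time.perf_counter() - start) * 1000
    print("%s -> %s in %.0f ms" % (host, ip, elapsed_ms))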


10: Minify JavaScript

Minification is the practice of removing unnecessary characters from code to reduce its size, thereby improving load times. When code is minified, all comments are removed, as well as unneeded white space characters (space, newline, and tab). In the case of JavaScript, this improves response time performance because the size of the downloaded file is reduced. Two popular tools for minifying JavaScript code are JSMin and the YUI Compressor.

Obfuscation is an alternative optimization that can be applied to source code. Like minification, it removes comments and white space, but it also munges the code. As part of munging, function and variable names are converted into smaller strings making the code more compact as well as harder to read. This is typically done to make it more difficult to reverse engineer the code. But munging can help performance because it reduces the code size beyond what is achieved by minification. The tool-of-choice is less clear in the area of JavaScript obfuscation. Dojo Compressor (ShrinkSafe) is the one I've seen used the most.

Minification is a safe, fairly straightforward process. Obfuscation, on the other hand, is more complex and thus more likely to generate bugs as a result of the obfuscation step itself. Obfuscation also requires modifying your code to indicate API functions and other symbols that should not be munged. It also makes it harder to debug your code in production. Although I've never seen problems introduced from minification, I have seen bugs caused by obfuscation. In a survey of ten top U.S. web sites, minification achieved a 21% size reduction versus 25% for obfuscation. Although obfuscation has a higher size reduction, I recommend minifying JavaScript code because of the reduced risks and maintenance costs.

In addition to minifying external scripts, inlined script blocks can and should also be minified. Even if you gzip your scripts, as described in Rule 4, minifying them will still reduce the size by 5% or more. As the use and size of JavaScript increases, so will the savings gained by minifying your JavaScript code.


11: Avoid Redirects

Redirects are accomplished using the 301 and 302 status codes. Here's an example of the HTTP headers in a 301 response:

      HTTP/1.1 301 Moved Permanently
      Location: http://example.com/newuri
      Content-Type: text/html

The browser automatically takes the user to the URL specified in the Location field. All the information necessary for a redirect is in the headers. The body of the response is typically empty. Despite their names, neither a 301 nor a 302 response is cached in practice unless additional headers, such as Expires or Cache-Control, indicate it should be. The meta refresh tag and JavaScript are other ways to direct users to a different URL, but if you must do a redirect, the preferred technique is to use the standard 3xx HTTP status codes, primarily to ensure the back button works correctly.

The main thing to remember is that redirects slow down the user experience. Inserting a redirect between the user and the HTML document delays everything in the page since nothing in the page can be rendered and no components can start being downloaded until the HTML document has arrived.

One of the most wasteful redirects happens frequently and web developers are generally not aware of it. It occurs when a trailing slash (/) is missing from a URL that should otherwise have one. For example, going to http://astrology.yahoo.com/astrology results in a 301 response containing a redirect to http://astrology.yahoo.com/astrology/ (notice the added trailing slash). This is fixed in Apache by using Alias or mod_rewrite, or the DirectorySlash directive if you're using Apache handlers.

Connecting an old web site to a new one is another common use for redirects. Others include connecting different parts of a website and directing the user based on certain conditions (type of browser, type of user account, etc.). Using a redirect to connect two web sites is simple and requires little additional coding. Although using redirects in these situations reduces the complexity for developers, it degrades the user experience. Alternatives for this use of redirects include using Alias and mod_rewrite if the two code paths are hosted on the same server. If a domain name change is the cause of using redirects, an alternative is to create a CNAME (a DNS record that creates an alias pointing from one domain name to another) in combination with Alias or mod_rewrite.


12: Remove Duplicate Scripts

It hurts performance to include the same JavaScript file twice in one page. This isn't as unusual as you might think. A review of the ten top U.S. web sites shows that two of them contain a duplicated script. Two main factors increase the odds of a script being duplicated in a single web page: team size and number of scripts. When it does happen, duplicate scripts hurt performance by creating unnecessary HTTP requests and wasted JavaScript execution.

Unnecessary HTTP requests happen in Internet Explorer, but not in Firefox. In Internet Explorer, if an external script is included twice and is not cacheable, it generates two HTTP requests during page loading. Even if the script is cacheable, extra HTTP requests occur when the user reloads the page.

In addition to generating wasteful HTTP requests, time is wasted evaluating the script multiple times. This redundant JavaScript execution happens in both Firefox and Internet Explorer, regardless of whether the script is cacheable.

One way to avoid accidentally including the same script twice is to implement a script management module in your templating system. The typical way to include a script is to use the SCRIPT tag in your HTML page.

      <script type="text/javascript" src="menu_1.0.17.js"></script>

An alternative in PHP would be to create a function called insertScript.

In addition to preventing the same script from being inserted multiple times, this function could handle other issues with scripts, such as dependency checking and adding version numbers to script filenames to support far future Expires headers.
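
The text doesn't spell out an implementation, but a minimal PHP sketch might look like the following; the version map and the specific filenames are assumptions for illustration:

      <?php
      // Hypothetical map from base names to versioned filenames,
      // supporting far future Expires headers
      $scriptVersions = array('menu.js' => 'menu_1.0.17.js');
      $insertedScripts = array();  // scripts already written to this page

      function insertScript($jsfile) {
          global $scriptVersions, $insertedScripts;

          // Prevent the same script from being inserted twice
          if (in_array($jsfile, $insertedScripts)) {
              return;
          }
          $insertedScripts[] = $jsfile;

          // Use the versioned filename when one is known
          $src = isset($scriptVersions[$jsfile]) ? $scriptVersions[$jsfile] : $jsfile;
          echo '<script type="text/javascript" src="' . $src . '"></script>' . "\n";
      }

      // Usage in a template:
      insertScript("menu.js");
      insertScript("menu.js");  // ignored: already inserted
      ?>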


13: Configure ETags

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser's cache matches the one on the origin server. (An "entity" is another word for what I've been calling a "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraint is that the string be quoted. The origin server specifies the component's ETag using the ETag response header.

      HTTP/1.1 200 OK
Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
ETag: "10c24bc-4ab-457e1c1f"
Content-Length: 12195

Later, if the browser has to validate a component, it uses the If-None-Match header to pass the ETag back to the origin server. If the ETags match, a 304 status code is returned, reducing the response by 12195 bytes in this example.

      GET /i/yahoo.gif HTTP/1.1
Host: us.yimg.com
If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
If-None-Match: "10c24bc-4ab-457e1c1f"

HTTP/1.1 304 Not Modified

The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won't match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.

The ETag format for Apache 1.3 and 2.x is inode-size-timestamp. Although a given file may reside in the same directory across multiple servers, and have the same file size, permissions, timestamp, etc., its inode is different from one server to the next.

IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is Filetimestamp:ChangeNumber. A ChangeNumber is a counter used to track configuration changes to IIS. It's unlikely that the ChangeNumber is the same across all IIS servers behind a web site.

The end result is that ETags generated by Apache and IIS for the exact same component won't match from one server to another. If the ETags don't match, users don't receive the small, fast 304 response that ETags were designed for; instead, they get a normal 200 response along with all the data for the component. If you host your web site on just one server, this isn't a problem. But if you have multiple servers hosting your web site, and you're using Apache or IIS with the default ETag configuration, your users are getting slower pages, your servers have a higher load, you're consuming greater bandwidth, and proxies aren't caching your content efficiently. Even if your components have a far future Expires header, a conditional GET request is still made whenever the user hits Reload or Refresh.

If you're not taking advantage of the flexible validation model that ETags provide, it's better to just remove the ETag altogether. The Last-Modified header validates based on the component's timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests. This Microsoft Support article describes how to remove ETags. In Apache, this is done by simply adding the following line to your Apache configuration file:

      FileETag none
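
Alternatively, if you want to keep ETags but make them consistent across a cluster, Apache can be told to build them only from attributes that are the same on every server; this is a suggestion of mine, not something from the article above:

      FileETag MTime Size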


14: Make Ajax Cacheable

People ask whether these performance rules apply to Web 2.0 applications. They definitely do! This rule is the first that resulted from working with Web 2.0 applications at Yahoo!.

One of the cited benefits of Ajax is that it provides instantaneous feedback to the user because it requests information asynchronously from the backend web server. However, using Ajax is no guarantee that the user won't be twiddling his thumbs waiting for those asynchronous JavaScript and XML responses to return. In many applications, whether or not the user is kept waiting depends on how Ajax is used. For example, in a web-based email client the user will be kept waiting for the results of an Ajax request to find all the email messages that match their search criteria. It's important to remember that "asynchronous" does not imply "instantaneous".

To improve performance, it's important to optimize these Ajax responses. The most important way to improve the performance of Ajax is to make the responses cacheable, as discussed in Rule 3: Add an Expires Header. Some of the other rules also apply to Ajax, such as gzipping components (Rule 4), minifying JavaScript (Rule 10), avoiding redirects (Rule 11), and configuring ETags (Rule 13).

However, Rule 3 is the most important for speeding up the user experience. Let's look at an example. A Web 2.0 email client might use Ajax to download the user's address book for autocompletion. If the user hasn't modified her address book since the last time she used the email web app, the previous address book response could be read from cache if that Ajax response was made cacheable with a future Expires header. The browser must be informed when to use a previously cached address book response versus requesting a new one. This could be done by adding a timestamp to the address book Ajax URL indicating the last time the user modified her address book, for example, &t=1190241612. If the address book hasn't been modified since the last download, the timestamp will be the same and the address book will be read from the browser's cache, eliminating an extra HTTP roundtrip. If the user has modified her address book, the timestamp ensures the new URL doesn't match the cached response, and the browser will request the updated address book entries.
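
A rough sketch of the server side, with the endpoint name, query parameters, and helper function all invented for illustration: the address book response gets a far future Expires header, and the t parameter is what changes the URL when the address book changes:

      <?php
      // addressbook.php -- hypothetical Ajax endpoint, called as:
      //   /addressbook.php?uid=123&t=1190241612
      // where t is the last time the user modified her address book.
      // The timestamp's only job is to change the URL; that is what
      // makes a far future Expires header safe on this response.

      // Hypothetical stand-in for the real address book lookup
      function loadAddressBook($uid) {
          return array(array('name' => 'Pat', 'email' => 'pat@example.com'));
      }

      header('Content-Type: application/json');
      // Expires roughly ten years out; the browser reuses the cached
      // response without revalidating as long as the URL is unchanged
      header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 315360000) . ' GMT');
      header('Cache-Control: max-age=315360000');

      echo json_encode(loadAddressBook($_GET['uid']));
      ?>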

Even though your Ajax responses are created dynamically, and might only be applicable to a single user, they can still be cached. Doing so will make your Web 2.0 apps faster.

