TEXTS FOR ABSTRACT WRITING IN COMPUTER ENGINEERING

"What we've done is narrow the gap between standard TCP/IP communications that everybody loves and knows how to use and have the

- 5 -

software to use and these more cutting edge technologies that are harder to use and difficult for people to program," he said.

TCP/IP communications software operates in software layers called a "protocol stack" inside individual LAN computers. That protocol software works in coordination with other software in sending and receiving data across the network connecting each computer.

"The sending and receiving software must be in synch to make sure that they are carrying as many bits as they can carry," Chase added. "That software has to run very efficiently or else the computers won't be able to keep up."

What the Duke team did was "streamline" software operations on both the sending and receiving sides of the central protocol stacks through a variety of modifications.

One change, called "zero-copy data movement," circumvents the time-consuming step of reading data from one area of computer network memory and writing it into another, which taxes a computer's central processing unit (CPU).
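
The Duke changes themselves live inside the kernel protocol stack and the network-card firmware, but the general zero-copy idea can be sketched at user level. The fragment below, a minimal sketch assuming a placeholder file and destination, contrasts the conventional read-then-send path, which copies file data through a user-space buffer, with Python's socket.sendfile(), which lets the operating system hand the data to the network without that extra copy where the platform supports it.

    import socket

    HOST, PORT = "127.0.0.1", 9000      # placeholder destination
    PATH = "payload.bin"                # placeholder file to transmit

    def send_with_copy(path, host, port):
        """Conventional path: data is read into a user-space buffer
        (one copy) and then written to the socket (another copy)."""
        with socket.create_connection((host, port)) as s, open(path, "rb") as f:
            while chunk := f.read(64 * 1024):
                s.sendall(chunk)

    def send_zero_copy(path, host, port):
        """Zero-copy path: socket.sendfile() uses the kernel's sendfile
        facility where available, so file data moves to the socket
        without passing through a user-space buffer."""
        with socket.create_connection((host, port)) as s, open(path, "rb") as f:
            s.sendfile(f)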

"One might think we could fix this problem by using faster CPUs," Chase added. "As it turns out, memory speeds are not growing as fast as CPU speed. As CPU speed increases relative to memory speed, your fast CPU spends a larger share of its processing power waiting for the slow memory to respond to these copy operations."

A related feature, called "scatter/gather input/output," allows data in various locations of computer memory to be rounded up and sent together as large messages. A third, called "checksum offloading," enables computers to use special hardware on their network cards to speed up error checking.
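
The scatter/gather part of that sentence has a user-level analogue that may make the idea concrete: rather than copying several buffers into one contiguous message, the pieces are handed to the kernel as a list and gathered on the way out. The sketch below uses Python's socket.sendmsg() on Unix with invented buffers and a placeholder peer; checksum offloading, by contrast, happens entirely in network-card hardware and has no user-level counterpart to show.

    import socket

    # Three pieces of a message held in separate buffers: a header,
    # a body, and a trailer. Gather I/O sends them as one message
    # without first copying them into a single contiguous buffer.
    header = b"HDR:"
    body = b"payload bytes ..."
    trailer = b":END"

    with socket.create_connection(("127.0.0.1", 9000)) as s:   # placeholder peer
        # sendmsg() gathers the buffer list into one outgoing message,
        # the user-level analogue of the kernel's gather-write path.
        s.sendmsg([header, body, trailer])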

Another innovation, for which the Duke group has filed for a patent, is "adaptive message pipelining," which schedules the movement of data between the network and an individual computer's memory to deliver high performance.

Some of these changes involved modifying software code. Others involved changing "firmware," code built into network cards that programmers ordinarily cannot alter. By special agreement, Myricom provided the Duke researchers with the tools to alter the firmware.

Major components of the network system that the Duke team did not alter are the protocol stacks themselves, obtained from FreeBSD, a standard Unix operating system whose TCP/IP source code is freely available.

"A lot of very smart people at a lot of places over a period of decades have done a lot of work trying to write the software that allows TCP

- 6-

communications at very high speeds," Chase said. "In some sense, what we have done really is show that they got it right."

 

Text 3. WORD SCANS INDICATE NEW WAYS OF SEARCHING THE WEB

International Online Conference on Computer Science

In the years after the American Revolution, U.S. presidents were talking about the British a lot, and then about militias, France and Spain. In the mid-19th century, words like "emancipation," "slaves" and "rebellion" popped up in their speeches. In the early 20th century, presidents started using a lot of business-expansion words, soon to be replaced by "depression."

A couple of decades later they spoke of atoms and communism. By the 1990s, buzzwords prevailed.

Jon Kleinberg, a professor of computer science at Cornell University, Ithaca, N.Y., has developed a method for a computer to find the topics that dominate a discussion at a particular time by scanning large collections of documents for sudden, rapid bursts of words. Among other tests of the method, he scanned presidential State of the Union addresses from 1790 to the present and created a list of words that easily reflects historical trends. The technique, he suggests, could have many "data mining" applications, including searching the Web or studying trends in society as reflected in Web pages.

Kleinberg will emphasize the Web applications of his searching technique in a talk, "Web Structure and the Design of Search Algorithms," at the annual meeting of the American Association for the Advancement of Science (AAAS) in Denver on Feb. 18. He is taking part in a symposium on "Modeling the Internet and the World Wide Web".

Kleinberg says he got the idea of searching over time while trying to deal with his own flood of incoming e-mail. He reasoned that when an important topic comes up for discussion, keywords related to the topic will show a sudden increase in frequency. A search for these words that suddenly appear more often might, he theorized, provide ways to categorize messages.

He devised a search algorithm that looks for "burstiness," measuring not just the number of times words appear, but the rate of increase in those numbers over time. Programs based on his algorithm can scan text that varies with time and flag the most "bursty" words. "The method is motivated by probability models used to analyze the behavior of communication networks, where burstiness occurs in the traffic due to congestion and hot spots," he explains.
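
Kleinberg's published method fits a probabilistic state model to the timing of word occurrences; the toy sketch below is only a simplified stand-in for the idea in the paragraph above: score a word not by its raw count but by how sharply its relative frequency rises from one time window to the next. The corpus and thresholds are invented for illustration.

    from collections import Counter

    def burst_scores(windows, min_count=1):
        """windows: list of token lists, one per consecutive time period.
        Returns {word: score}, where the score is the largest jump in the
        word's relative frequency between adjacent windows, a crude proxy
        for the notion of a 'bursty' word."""
        freqs = []
        for tokens in windows:
            counts = Counter(tokens)
            total = max(len(tokens), 1)
            freqs.append({w: c / total for w, c in counts.items() if c >= min_count})

        scores = {}
        for prev, cur in zip(freqs, freqs[1:]):
            for word, f in cur.items():
                jump = f - prev.get(word, 0.0)
                scores[word] = max(scores.get(word, 0.0), jump)
        return scores

    # Example: three toy "speeches"; "emancipation" bursts in the last one.
    windows = [
        "the union the states the budget".split(),
        "the union the war the states".split(),
        "emancipation emancipation emancipation the union the war".split(),
    ]
    top = sorted(burst_scores(windows).items(), key=lambda kv: -kv[1])[:3]
    print(top)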

In his own e-mail -- largely from other computer scientists -- he quickly found keywords relating to hot topics. In mail from students he found bursts in the word "prelim" shortly before each midterm exam. Later, he tried the same technique on the texts of State of the Union addresses, all of which are available on the Web, from Washington in 1790 through George W. Bush in 2002. From these speeches he produced a long list of words (see attached table) that summarizes American politics from early revolutionary fervor up to the age of the modern speechwriter.

While we already know about these trends in American history, Kleinberg points out, a computer doesn't, and it has found these ideas just by scanning raw text. So such a technique should work just as well on historical records in obscure situations where we have no idea what the important terms or keywords are. It might even be used to screen e-mail "chatter" by terrorists. Sociologists, Kleinberg adds, may find it interesting to look for trends in personal Web logs popularly known as "blogs."

For searching the Web, Kleinberg suggests, such a technique could help zero in on what a searcher wants by recognizing the time context of such material as news stories. For instance, he says, a person searching for the word "sniper" today is likely to be looking for information about the recent attacks around the nation's capital -- but the same search nearly four decades ago might have come from someone interested in the Kennedy assassination.

In his AAAS talk Kleinberg also explores other Web-searching techniques. A few years ago, he suggested that a way to find the most useful Web sites on a particular subject would be to look at the way they are linked to one another. Sites that are "linked to" by many others are probably "authorities." Sites that link to many others are likely to be "hubs." The most authoritative sites on a topic would be the ones that are linked to most often by the most active hubs, he reasoned. A variation on this idea is used by Google, and a more formal version is being used in a new search engine called Teoma (http://www.teoma.com).
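
The hub/authority reasoning described above can be written down directly as an iteration. The sketch below is a minimal version of that mutual-reinforcement idea, not of Google's or Teoma's production systems: a page's authority score sums the hub scores of pages linking to it, its hub score sums the authority scores of pages it links to, and the scores are renormalized until they settle.

    def hubs_and_authorities(links, iterations=50):
        """links: {page: [pages it links to]}. Returns (hub, authority) dicts
        computed by the mutual-reinforcement iteration described above."""
        pages = set(links) | {q for targets in links.values() for q in targets}
        hub = {p: 1.0 for p in pages}
        auth = {p: 1.0 for p in pages}
        for _ in range(iterations):
            # Authority: sum of hub scores of pages that link to you.
            auth = {p: sum(hub[q] for q in pages if p in links.get(q, ())) for p in pages}
            # Hub: sum of authority scores of the pages you link to.
            hub = {p: sum(auth[q] for q in links.get(p, ())) for p in pages}
            # Normalize so the scores stay bounded.
            for d in (auth, hub):
                norm = sum(v * v for v in d.values()) ** 0.5 or 1.0
                for p in d:
                    d[p] /= norm
        return hub, auth

    # Toy web: two hub pages both pointing at the same candidate authorities.
    toy = {"hub1": ["siteA", "siteB"], "hub2": ["siteA", "siteB"], "siteA": [], "siteB": []}
    hub, auth = hubs_and_authorities(toy)
    print(max(auth, key=auth.get))   # one of the heavily linked-to authorities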

Kleinberg and others have found that despite its anarchy, there is a great deal of "self-organization" on the Web. In a variation on the "six degrees of separation" idea, Kleinberg says, almost every site on the Web can be reached from almost any other through a series of steps. The structure seems to be a bit like the Milky Way galaxy, with a very dense "core" of heavily interconnected sites surrounded by less dense regions. Nodes outside the core are divided into three categories: "upstream" nodes that link to the core but cannot be reached from it; "downstream" nodes that can be reached from the core but don't link back to it; and isolated "tendrils" that are not linked directly to the core at all.
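
Given a link graph and a chosen core, the upstream/downstream/tendril split described above reduces to two reachability tests, one following links forward from the core and one following them backward. The sketch below, with a toy graph and core invented for illustration, classifies nodes that way.

    from collections import deque

    def reachable(graph, start_nodes):
        """Breadth-first search: all nodes reachable from start_nodes."""
        seen, queue = set(start_nodes), deque(start_nodes)
        while queue:
            for nxt in graph.get(queue.popleft(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def classify(graph, core):
        """Split non-core nodes into upstream, downstream, and tendrils."""
        reverse = {}
        for src, targets in graph.items():
            for dst in targets:
                reverse.setdefault(dst, []).append(src)
        from_core = reachable(graph, core)     # nodes the core can reach
        to_core = reachable(reverse, core)     # nodes that can reach the core
        nodes = set(graph) | {d for ts in graph.values() for d in ts}
        labels = {}
        for n in nodes - set(core):
            if n in to_core and n not in from_core:
                labels[n] = "upstream"
            elif n in from_core and n not in to_core:
                labels[n] = "downstream"
            elif n not in from_core and n not in to_core:
                labels[n] = "tendril"
            else:
                labels[n] = "core-like"        # reaches the core and is reached by it
        return labels

    toy = {"u": ["c1"], "c1": ["c2"], "c2": ["c1", "d"], "d": [], "t": []}
    print(classify(toy, core={"c1", "c2"}))
    # {'u': 'upstream', 'd': 'downstream', 't': 'tendril'}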

Within this structure there are many "communities" of sites representing common interests that are extensively linked to one another. So, Kleinberg suggests, searches might be done by following along the link paths from one site to another, as well as just scanning an index of everything.

"Deeper analysis, exposing the structure of communities embedded in the Web, raises the prospect of bringing together individuals with common interests and lowering barriers to communication," Kleinberg concludes.

 

Text 4. SAPPHIRE/SLAMMER WORM SHATTERS PREVIOUS SPEED RECORDS FOR SPREADING THROUGH THE INTERNET

International Online Conference on Computer Science

A team of network security experts in California has determined that the computer worm that attacked and hobbled the global Internet 11 days ago was the fastest computer worm ever recorded. In a technical paper released today, the experts report that the speed and nature of the Sapphire worm (also called Slammer) represent significant and worrisome milestones in the evolution of computer worms.

Computer scientists at the University of California, San Diego and its San Diego Supercomputer Center (SDSC), Eureka-based Silicon Defense, the University of California, Berkeley, and the nonprofit International Computer Science Institute in Berkeley, found that the Sapphire worm doubled its numbers every 8.5 seconds during the explosive first minute of its attack. Within 10 minutes of debuting at 5:30 a.m. (UTC) Jan. 25 (9:30 p.m. PST, Jan. 24) the worm was observed to have infected more than 75,000 vulnerable hosts. Thousands of other hosts may also have been infected worldwide. The infected hosts spewed billions of copies of the worm into cyberspace, significantly slowing Internet traffic, and interfering with many business services that rely on the Internet.
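
As a rough check of what "doubling every 8.5 seconds" implies, assuming a single starting host and unchecked exponential growth (which, as the article explains later, held only briefly before the worm's own traffic clogged the network):

    import math

    DOUBLING_TIME = 8.5          # seconds, from the measurements above
    initial_hosts = 1            # assumed single starting point

    def hosts_after(seconds):
        """Infected population under pure doubling every 8.5 seconds."""
        return initial_hosts * 2 ** (seconds / DOUBLING_TIME)

    print(round(hosts_after(60)))                  # ~133 hosts after the first minute
    t_75k = DOUBLING_TIME * math.log2(75_000)      # time to reach 75,000 hosts
    print(round(t_75k))                            # ~138 seconds if growth never slowed

Pure doubling would reach 75,000 hosts in a little over two minutes; the observed ten-minute figure reflects the slowdown once the available bandwidth saturated.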

"The Sapphire/Slammer worm represents a major new threat in computer worm technology, demonstrating that lightning-fast computer worms are not just a theoretical threat, but a reality," said Stuart Staniford, president and founder of Silicon Defense. "Although this particular computer worm did not carry a malicious payload, it did a lot of harm by spreading so aggressively and blocking networks."

The Sapphire worm's software instructions, at 376 bytes, are about the length of the text in this paragraph, or only one-tenth the size of the Code Red worm, which spread through the Internet in July 2001. Sapphire's tiny size enabled it to reproduce rapidly and also fit into a type of network "packet" that was sent one-way to potential victims, an aggressive approach designed to infect all vulnerable machines rapidly and saturate the Internet's bandwidth, the experts said. In comparison, the Code Red worm spread much more slowly not only because it took longer to replicate, but also because infected machines sent a different type of message to potential victims that required them to wait for responses before subsequently attacking other vulnerable machines.
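
The "one-way packet" described above corresponds to a connectionless UDP datagram, while Code Red's wait-for-response messages correspond to TCP connections, which cannot carry data until a handshake round trip completes. The harmless sketch below, aimed at a placeholder localhost address, only contrasts the two send patterns; it illustrates the protocol difference, not the worm.

    import socket

    payload = b"x" * 376                  # something the size of Sapphire's code
    target = ("127.0.0.1", 9999)          # placeholder test address

    # One-way (UDP): a single datagram goes out immediately; the sender
    # neither connects nor waits for any acknowledgment from the receiver.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.sendto(payload, target)

    # Request/response (TCP): connect() must complete a handshake round trip
    # before any data can be sent, so each new peer costs at least one
    # network round trip, the delay that slowed Code Red's spread.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp:
        tcp.settimeout(1.0)
        try:
            tcp.connect(target)
            tcp.sendall(payload)
        except OSError:
            pass                          # nothing listening on the placeholder port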

The Code Red worm ended up infecting 359,000 hosts, in contrast to the approximately 75,000 machines that Sapphire hit. However, Code Red took about 12 hours to do most of its dirty work, a snail's pace compared with the speedy Sapphire.

The Code Red worm sent six copies of itself from each infected machine every second, in effect "scanning" the Internet randomly for vulnerable machines. In contrast, the speed with which the diminutive Sapphire worm copied itself and scanned the Internet for additional vulnerable hosts was limited only by the capacity of individual network connections.

"For example, the Sapphire worm infecting a computer with a one-megabit-per-second connection is capable of sending out 300 copies of itself each second," said Staniford. A single computer with a 100-megabit-per-second connection, found at many universities and large corporations, would allow the worm to scan 30,000 machines per second.

"The novel feature of this worm, compared to all the other worms we've studied, is its incredible speed: it flooded the Internet with copies of itself so aggressively that it basically clogged the available bandwidth and interfered with its own growth," said David Moore, an Internet researcher at SDSC's Cooperative Association for Internet Data Analysis (CAIDA) and a Ph.D. candidate at UCSD under the direction of Stefan Savage, an assistant professor in the Department of Computer Science and Engineering. "Although our colleagues at Silicon Defense and UC Berkeley had predicted the possibility of such high-speed worms on theoretical grounds, Sapphire is the first such incredibly fast worm to be released by computer hackers into the wild," said Moore.

Sapphire exploited a known vulnerability in Microsoft SQL servers used for database management, and MSDE 2000, a mini version of SQL for desktop use. Although Microsoft had made a patch available, many machines did not have the patch installed when Sapphire struck. Fortunately, even the successfully attacked machines were only temporarily out of service.


"Sapphire's greatest harm was caused by collateral damage--a denial of legitimate service by taking database servers out of operation and overloading networks," said Colleen Shannon, a CAIDA researcher. "At Sapphire's peak, it was scanning 55 million hosts per second, causing a computer version of freeway gridlock when all the available lanes are bumper-to-bumper." Many operators of infected computers shut down their machines, disconnected them from the Internet, installed the Microsoft patch, and turned them back on with few, if any, ill effects.

The team in California investigating the attack relied on data gathered by an array of Internet "telescopes" strategically placed at network junctions around the globe. These devices sampled billions of information-containing "packets" analogous to the way telescopes gather photons.

With the Internet telescopes, the team found that nearly 43 percent of the machines that became infected were located in the United States, almost 12 percent were in South Korea, and more than 6 percent were in China.



 