Workshops during the General Online Research 05
1.) title of the workshop:
Hidden data collection and data mining on the Internet
Dr. Dietmar Janetzko (University of Freiburg)
2.) duration of the workshop:
2 X 90 Minutes
3.) workshop fees:
50,- EUR (78,- CHF) / 25,- EUR (39,- CHF)
4.) target groups:
The workshop addresses people from both industry (market research, opinion research, marketing) and academia (psychology, economics, political science, sociology, pedagogics).
5.) Is the workshop prepared for an exclusively
German-speaking or an international audience?
The materials distributed among participants of the workshop (booklet, CD-ROM) will be in English.
6.) workshop language:
Depending on the language preferences of the participants, the workshop language will be either English or German.
7.) Description of the content of the workshop:
Non-reactive data collection is an umbrella term for all methods of hidden data collection. Subjects who do not know that their behavior is being investigated or observed are expected not to react to the investigation (e.g., by impression management). These techniques are therefore deployed on the assumption that the subjects will exhibit their "true" behavior. Non-reactive data collection on the Internet occupies a firm position in all post-modern conspiracy theories about secret services and companies that act as global players. But like any other type of data collection, non-reactive data collection is limited and will only elicit a tiny fraction of the data that could be collected about a person. However, once combined with other sources of information that may or may not be freely available on the market, data collected in a non-reactive fashion can reveal insights into persons that severely affect the privacy of Internet users and that can be exploited for economic purposes. A well-known case in point is spamming, where non-reactive data collection is used to distinguish hot addresses from cold addresses. While public attention is usually narrowed down to cookies, a large and quite diverse array of techniques and tools is already in use that can be applied alternatively or in combination, e.g., covert time measurement, Internet monitoring software, session IDs/non-persistent cookies, logfiles, Web beacons/Web bugs/clear GIFs, or spyware. Apart from combining data from different sources, there is an overall tendency to use data mining techniques to improve the signal-to-noise ratio of the data collected.
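To make two of the techniques named above more concrete (logfiles and Web bugs), the following is a minimal sketch, not the workshop's own software. It assumes a server logfile in the Apache Common Log Format and a hypothetical invisible image /pixel.gif that is requested with a hypothetical "id" query parameter. The path, the parameter name and the sample log lines are all invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Apache Common Log Format: host ident user [time] "request" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical Web bug URL; an assumption, not taken from the workshop material.
BEACON_PATH = "/pixel.gif"


def beacon_hits(lines):
    """Count Web bug requests per value of the assumed 'id' query parameter."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the expected format
        url = urlsplit(match.group("target"))
        if url.path != BEACON_PATH:
            continue  # ordinary page request, not the Web bug
        for tag in parse_qs(url.query).get("id", []):
            hits[tag] += 1
    return hits


# Invented sample data using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Mar/2005:09:15:02 +0100] "GET /pixel.gif?id=a17 HTTP/1.1" 200 43',
    '192.0.2.10 - - [10/Mar/2005:09:15:40 +0100] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.7 - - [10/Mar/2005:10:02:11 +0100] "GET /pixel.gif?id=b42 HTTP/1.1" 200 43',
    '192.0.2.10 - - [11/Mar/2005:08:01:00 +0100] "GET /pixel.gif?id=a17 HTTP/1.1" 200 43',
]

if __name__ == "__main__":
    for tag, count in sorted(beacon_hits(SAMPLE_LOG).items()):
        print(tag, count)
```

In this sketch every request for the image leaves an ordinary logfile entry, so the visitor is not asked anything and has nothing to react to. An identifier that shows up in the count has triggered at least one request; one that never shows up has not.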
Non-reactive data collection has many different aspects, ranging from purely technical gadgets enjoyed by nerds, through economic strategies, to privacy. For this reason we focus on the topics mentioned below:
A) Methodology and Technologies of Non-Reactive
Data Collection
B) Combining Mass Data and Publicly Available
Data
C) Legal Aspects of Non-Reactive Data Collection
D) Technical Devices to Prevent Non-Reactive Data
Collection
In the workshop, knowledge about non-reactive
data collection is communicated via talks, software demonstrations
and exercises. For this reason, it is recommended
that every participant bring his or her own notebook
to the workshop. However, we will try to find
a solution for those wishing to attend without
a notebook. Participants of the workshop will
be provided with a booklet and a CD-ROM that contains
a ready-to-run collection of software tools
for non-reactive data collection. This
is made possible by an Apache-based
client-server architecture that runs from the CD-ROM
and requires no installation on the side of
the participants.
8.) goals of the workshop:
Participants will learn the methodology and the
techniques used for non-reactive data elicitation
and gain hands-on experience in using them.
Ethical and legal aspects of non-reactive data
collection will also be introduced.
9.) necessary previous knowledge:
General Internet literacy is a must, knowledge
of HTML is highly recommended, and knowing the
basics of JavaScript would be helpful.
10.) literature that has to be read for participation:
11.) additional literature
12.) information about the workshop organizer:
Dr. Dietmar Janetzko and Roman Kennke work
at the Institute of Computer Science and Social
Research at the University of Freiburg, Germany.
Their work focuses on methods of online research,
knowledge management and data mining.