question everything: Featured


Wednesday, March 9, 2011

Women and/or technical skills

Many people believe that women are on average less skilled in technical matters than men. Let me tell you a secret about that prejudice:

It's true!

Why yes! It totally is!

The only problem is that people frequently misinterpret what that means.

Because you know, it's all about that nasty conditional probability thing once again. Bayes' theorem. Why "the safest way to fly is to take a bomb with you, because it's damn unlikely that there are two bombs on the same plane" doesn't work. Something closely related to and even less understood than the Monty Hall problem, which is why it has deserved to be also demonstrated with three doors.

Well, let me try to enlighten you with some very basic mathematical insights.

Starting directly with that problem would make the discussion rather abstract and theoretical and you would end up not believing me. Let's therefore stick with the lipstick example for starters. Or, since wearing lipstick is a rather binary decision -- you do or you don't --, let's generalize it to wearing make-up. Also, let's restrict it to typical western countries.

Have a look at the following diagram:

(The x-axis gives the amount of make-up usage, and the y-axis gives the number of people for each position on the x-axis.)

There is a small but not insignificant number of women who would rather die than wear make-up. There is a small but not insignificant number of men who plaster their faces with make-up as if they would fall apart without it. But on average, women still do wear more make-up than men.

So given two random persons about whom we have no information other than one being female and the other male, which person is more likely to be wearing make-up? Answer: Quite clearly the female one.

Let's further assume that the more make-up people wear, the more they know about make-up. Which of the two persons above is therefore more likely to know a lot about make-up? Answer: Again the female one.

Let's take a third person into consideration now. We have no information about that new person other than that he or she likes wearing tons of make-up. Which of the three persons is most likely to know a lot about make-up now? Answer: Obviously the third one. Independently of whether he or she is male or female. Why? Because while we assume that the male person is probably in the lower half of the range and the female person is probably in the upper half, we know the third person to be in the very top few percent and thus, on average, better than either of the other two.

Now that we are already busy introducing new persons, let's introduce two more of them. Person D is female and wears tons of make-up. Person E is male and wears tons of make-up. Who is more likely to have a lot of knowledge about make-up now, person D or person E? Answer: Well, at this point it's pretty much 50:50.

Why is that?

Let's have another look at that diagram. We know that both persons wear [too] much make-up, which means we are only considering persons in the very right part of the diagram any more:

What can we tell about that small subset of the population?

They all know a lot about make-up. (Keep in mind that make-up usage and thus (by our assumption) knowledge is measured by the position on the left-right axis. All persons in question are on the very right edge of that graph.)
There are more females than males in that range. (I.e., there are more females than males who really know a lot about make-up.)
A male and a female person being both within that range are pretty much equally likely to win a make-up knowledge quiz against each other. In this particular graph, it's even slightly more likely for the male, i.e. person E, to know more about make-up. Why? We know about person D that she is female and within the highlighted range. Amongst the females in that range, there are more towards the lower end (at the left) than at the upper end, so the average female from within this range is slightly below the center of the range. Amongst the few males in that range on the other hand, all positions are pretty much equally (un)likely, so the average male from within this range is quite exactly at the center of this range. Therefore, in this graph person E is actually an itsy bitsy tiny bit more likely to know a lot than person D. It's a damn close call though, so with sufficient approximation we can say that they are equally likely to be more knowledgeable about make-up.

(Bonus question: Remember person three from above? We only know that this person wears a lot of make-up. Is person three more likely to be male or female?)
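The three observations above can be checked with a few lines of arithmetic. The following is only a sketch under stated assumptions: the two bell curves, the 0-100 usage scale and the threshold of 85 are invented numbers, not the curves from the diagram, and the helper names `tail_prob` and `tail_mean` are made up for this illustration.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for the tail formulas


def tail_prob(dist, t):
    """P(X > t) for a normally distributed X."""
    return 1.0 - dist.cdf(t)


def tail_mean(dist, t):
    """E[X | X > t] for a normally distributed X (truncated-normal mean)."""
    alpha = (t - dist.mean) / dist.stdev
    return dist.mean + dist.stdev * STD.pdf(alpha) / (1.0 - STD.cdf(alpha))


# Hypothetical curves on a 0-100 "make-up usage" scale (invented numbers).
women = NormalDist(mu=60, sigma=15)
men = NormalDist(mu=40, sigma=15)
threshold = 85  # assumed cut-off for "wears tons of make-up"

# Bonus question: share of women among heavy users, assuming the
# population contains equally many women and men.
p_w, p_m = tail_prob(women, threshold), tail_prob(men, threshold)
share_women = p_w / (p_w + p_m)

# Gap in average usage: whole population vs. heavy users only.
gap_overall = women.mean - men.mean
gap_in_tail = tail_mean(women, threshold) - tail_mean(men, threshold)

print(f"share of women among heavy users:  {share_women:.1%}")
print(f"gap in average usage, everyone:    {gap_overall:.1f}")
print(f"gap in average usage, heavy users: {gap_in_tail:.1f}")
```

With these assumed curves, heavy users are mostly women, yet the gap between the average woman and the average man shrinks to a small fraction of the population-wide gap once both are known to be heavy users. Which way the small remaining gap points depends on the exact shape of the curves and the range chosen, so it need not match the diagram in this post.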

Now let me just re-label and re-colour that graph and we'll transfer our findings to the original problem in no time.

I know that it's extremely unpopular these days to claim that women have on average less technical competence than men, but for the sake of argument, let's assume it's true:

Now let's only look at those people who are successful in technical careers. With few exceptions, people in technical careers are people with high technical skills, and vice versa. (There might be the occasional technically skilled person who still chooses to study medicine, or the occasional person who has no clue about technical matters but got hired into daddy's company anyways, but for simplicity we ignore those for now.) People with technical careers therefore mostly correspond to the right-most part of the diagram:

And now, the same conclusions apply:

All persons who are successful in technical careers have high technical competence.
There are more males than females who have high technical skills.
Males and females from within the shown range are pretty much evenly matched, i.e., given two persons of whom one is a female technician and the other a male technician, it's hard to tell which one probably knows more about technical matters.

In short: It's less likely for women to go for a technical career, but those who do are quite evenly matched with their male colleagues.

What do we learn from that?

Given an entirely random male person and an entirely random female person, it's absolutely fair to assume that the male person has more technical competence. Let's assume you are at a quiz show, are given a question about the inner workings of a car engine, have no idea about the correct answer, and are allowed to use a phone joker. You can only choose whether the person to be called is female or male, and the quiz team will then pick a random (male or female) name from a phone book and call that person. Then it's very reasonable to go for a male helper.
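How good a bet the phone joker is depends on how far apart the two curves are. Here is a rough sketch, keeping the "for the sake of argument" premise from above: the curve parameters are invented for illustration only and are not measurements of anything.

```python
from math import hypot
from statistics import NormalDist

# Hypothetical skill curves on an arbitrary scale (invented numbers).
men = NormalDist(mu=55, sigma=15)
women = NormalDist(mu=45, sigma=15)

# The difference of two independent normal variables is normal again:
# its mean is the difference of the means, its variance the sum of the variances.
diff = NormalDist(mu=men.mean - women.mean,
                  sigma=hypot(men.stdev, women.stdev))

# Probability that a random man scores higher than a random woman.
p_man_knows_more = 1.0 - diff.cdf(0.0)
print(f"P(random man knows more than random woman) = {p_man_knows_more:.2f}")
```

Under these assumptions the result lies above one half but well below one: picking the male helper is the better bet when nothing else is known, and still only a bet.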

Let's, in contrast, assume that your car broke down with a flat tyre and you don't know how to change a tyre yourself. A second car stops, a young woman gets out and offers to change the tyre for you. Here, "You can't do that, you're a girl." is a wrong answer. (Not to mention that it's impolite.) Why? Because she already indicated to you that she can. She gave you some extra information about herself, which means that all conclusions you drew before that aren't valid any more. Your conclusions were made at a point where you had to take the entire range into consideration. Now however this new information significantly restricted the range. Therefore, take the new information and re-evaluate.

There's a very similar scenario, and it seems to happen so frequently that I'm willing to dedicate another diagram to it:

So, assume you call the tech hotline of whatever company to get help with a technical problem. A woman picks up. Again, "No, I want to speak to a man." is the wrong reaction. The person works at the tech hotline, so you can assume that she went through job interviews and technical training. It's also very likely that most employees at that tech hotline are roughly within the same skill range. In short, whether male or female, you will get to speak with an averagely skilled technician. (Good technicians usually get better jobs than working at tech hotlines, just for your information.)

This can be generalized to any job, by the way: While on average over the entire population women might be less technically skilled than men, there is virtually no difference between men and women working in the same job.

The moral of the story? There's nothing bad about prejudices. Just remain open to trashing them as soon as you get additional information about a person.

After all, while it's true that only around 20% of all people own a dog, there's no point in insisting that someone most likely doesn't have one once you've seen them taking it for a walk.
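The dog example is Bayes' theorem in miniature. A minimal sketch: the 20% prior is the figure from this post, while the two likelihoods (how often you would catch an owner or a non-owner walking a dog) are pure assumptions, and the function name `posterior` is made up for this illustration.

```python
def posterior(prior, p_obs_if_true, p_obs_if_false):
    """Bayes' theorem for a yes/no hypothesis and a single observation."""
    evidence = prior * p_obs_if_true + (1.0 - prior) * p_obs_if_false
    return prior * p_obs_if_true / evidence


prior_owner = 0.20             # from the post: around 20% own a dog
p_walk_if_owner = 0.05         # assumed: chance of catching an owner on a walk
p_walk_if_not_owner = 0.0005   # assumed: non-owner walking someone else's dog

p_owner = posterior(prior_owner, p_walk_if_owner, p_walk_if_not_owner)
print(f"before seeing the walk: {prior_owner:.0%}")
print(f"after seeing the walk:  {p_owner:.0%}")
```

The exact likelihoods matter little. As long as owners are far more likely than non-owners to be seen walking a dog, a single observation is enough to overturn the 20% prior.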

-- Birgit

P.S.: I admit to having shamelessly simplified away a lot of potentially important facts and factors.

Friday, February 18, 2011

A cheer to freeware

(To the German version.)

Every time I re-setup my computer I realize that more and more freeware is running on it. In former times, setting up a computer meant inserting 20 CD-ROMs one after another -- MS Windows, MS Word, MS Office, Paint Shop Pro, ... --, today setting up means to me: First installing Windows, then downloading the most recent versions of all other programs.

Therefore, a cheer to freeware -- which by the way is according to certain sources in the USA a very communist construct ;) --, thanks to which I by now use hardly any proprietary software any more except for Windows.

Here's a list of great freeware programs (or in some cases shareware or demo versions) that are usually installed on my computers:

Basics:

Acrobat Reader / Foxit Reader: Reading .pdf files
GhostView / GhostScript: Reading .ps files
PDF24 Creator: Creating and editing of .pdf files
pdf995: Creating .pdf files by a printer driver
TortoiseSVN: Version control with SVN
CDBurnerXP: Burning CDs and DVDs
7-Zip: File compression program for (almost) all formats
cygwin: Linux emulator
DOSBox: DOS emulator

Creating and editing of documents:

Notepad++: Text editor and source code editor
LibreOffice: Office programs: Text editing, spreadsheet processing, presentations, ...
MikTeX: Compilation of LaTeX documents
WinShell: LaTeX editor
Asymptote: Programming language and compiler for creation of vector graphics
GeoGebra / Euklid DynaGeo: Creation and editing of interactive geometry sketches

Programming:

eclipse: Development environment
Java JDK: Java (development kit and Virtual Machine)
Visual C++ Express: C++ (development environment and compiler)
Python: Python
SWI Prolog: Prolog (development environment and compiler)

Graphics:

IrfanView: Display of image files
Gimp: Image editing
Paint.net: Image editing
Inkscape: Editor for vector graphics
autostitch: Assembling large images from multiple photos (image stitching)

Music and multimedia:

iTunes: Music playback, download and playback of podcasts, managing of files on an iPod
VLC Media Player: Playback of videos and DVDs
Winamp: Music playback
Amarok: Music playback
VirtualDub: Video recording and editing
NoteWorthy Composer (Demo): Creation of sheets of music

Internet:

Firefox: Browser
Chrome: Browser
Thunderbird: Email and newsgroup client
IMAPSize: Backup of IMAP email accounts
PuTTY: SSH and Telnet client
WinSCP: FTP and SFTP client with GUI
pidgin / qip: Instant messenger (for ICQ, AIM, ...)
ChatZilla: IRC client
Skype: Skype client (for internet telephony)
Apache: Webserver (for local testing of homepages)
Vuze: client for peer-to-peer filesharing

Antivirus:

Avira AntiVir: Anti-virus software
Spybot: Anti-spyware program

Databases:

MySQL: MySQL database system
NaviCat Lite: GUI for MySQL

Kind regards, Birgit

Edit (2011-02-18): Add Foxit Reader and PDF24 Creator.

Edit (2011-02-24): Update OpenOffice.org to LibreOffice.

Monday, January 31, 2011

Life -- A review

Life is a game, some people say. Well, if it's a game, then it deserves a review.

I shall assume that most people are familiar with the basic rules, so I'll skip directly to the discussion about some concepts, some strategies, and replay value.

Some flawed concepts

There are various concepts that are frowned upon in games, and for a reason. Unfortunately, life sports quite a few of them.

One big factor is randomness. There are various game elements that appear to be completely random (in spite of actually following certain hidden rules), and, worse than that, unnecessarily random, and not adding much to the game play.

Let's start with random starting positions. While random setup helps to keep game play varied, in this case the random starting positions just have too much influence on the kind of options a player has and their winning chances. Some starting positions are just too strong and others too weak. A player starting in certain regions of Africa for example has low chances of going for an academic career, and almost no chance of getting far ahead on the money track. Also, in many cases, the starting position already more or less determines the entire strategy and outlook of a player, and sometimes leaves very little choice. The most egregious cases here are of course those where a player is forced into a certain role and possibly even killed before getting old enough to make their own decisions, let alone reach the age of consent. Child soldiers, child pornography and child prostitution are the worst (and unfortunately not exactly rare) cases here. But even leaving those aside, players starting in conflict regions or very religious societies, for example, usually have very low chances of completely staying out of those things, even after reaching adulthood.

Closely related to that is a certain running leader problem, especially in terms of power and money. Players who are at some point ahead in one of these fields usually will gather more and more, while players starting low have almost no chance of catching up.

Similarly, random player attributes have an unduly large influence. The most prominent example here is gender, which is randomly assigned to players when they start playing, and even though there are some special rules that will allow changing it later, they have so many drawbacks that few people will actually use them. Closely related to that is sexual orientation, though that is one random factor that I personally find quite interesting, because it gives some players a kind of side quest without completely spoiling the game experience for them if they don't complete it. Still, depending on the starting position it might be too much of a disadvantage in some situations, especially in those where public display of queer orientation is threatened with severe social consequences and even the death penalty.

Another completely unbalanced random player attribute is disability, both physical and mental. Physical disability such as blindness or dwarfism can have a huge impact on the options available to a player, for example career choices. As far as mental disabilities are concerned, I wouldn't dare to evaluate how much they really influence the player's game experience for the better or the worse, but I certainly find the concept unnecessary.

One more very random element is love, a game mechanism that more or less randomly adds strong emotional connections between players. Worse than that, these connections don't necessarily go both ways; in fact, they surprisingly often don't. Which wouldn't be so bad, if one-way love connections didn't have such detrimental effects on affected players. Reciprocal connections on the other hand usually offer a very big boost to both involved players. Not surprisingly, many players therefore consider this concept one of the best things about the game, even though it reduces predictability. But given how random everything is already, a little bit of extra randomness is probably a small price to pay.

The one concept that truly sucks however and makes playing utterly unpleasant at times is player elimination. Interestingly though, in contrast to other games with player elimination that I have played, the big problem with player elimination in life is not that the eliminated players sitting out are bored -- at least not that I'm aware of --, but rather the negative effect that the elimination of a player has on the players remaining in the game.

Discussion of Strategies

As with many complex games, the best strategies are often very hard to figure out. It's already quite complicated to find the reasonably good ones.

   Pure strategies

As far as life is concerned, many players seem to go for money as their main winning strategy. While that strategy by itself is not horribly bad, I personally have the feeling that those players are missing quite a bit of the game play. Also, looking at the outcomes of players who go for that strategy, it seems to stagnate at some point rather than going infinitely upwards, and can even genuinely backfire in some cases.

Another popular but in my humble opinion very much overrated strategy is drugs. They might give a very short immediate boost, but in the long run the strategy always shows an overall downhill trend. So far I haven't met a single player for whom the strategy ever worked out.

Love is a third strategy that is used often, with the player concentrating all efforts on finding and retaining a permanent love relationship. Given how random the concept of love is (as described above), this appears to be an ill-conceived idea, and while it goes well often enough, it also often goes horribly and painfully wrong, to the point of players dedicating more resources than reasonably justifiable or even comprehensible to gaining, preserving or regaining love. The possible end results are stalking, pathological jealousy, erotomania, suicide, and a whole branch of psychotherapists making a living off it.

Escapism is the fourth of the big strategies, usually achieved through excessive computer gaming and/or submersion in imaginary worlds. In combination with the drug strategy it can lead to long phases of almost complete loss of connection with the outside world. In combination with the love strategy it is expressed through excessive intake of mediocre literature such as Rosamunde Pilcher and soap operas, combined with the firm belief in their realism. It can also lead to sickly sweet plans of marrying in white and living happily ever after, usually without ever accomplishing them. While many players go through a phase of escapism during their transition from childhood to adulthood, continuing the strategy beyond that age can lead to drastically reduced chances of actual accomplishment in the game. Disturbingly, the strategy is just good enough that most players who get stuck in it at some point decide to actively try to perpetuate it. And indeed abandoning the strategy at a later stage often comes with rather severe consequences -- such as realizing how many years of their life they already wasted with it --, which ironically makes continuing the strategy a very tempting choice.

Closely related is the strategy of complete submersion in work, with the only exception that instead of imaginary worlds and problems, real ones are used. The most important positive factor here is constant positive feedback from the immediate environment. The big drawback is that the strategy requires a lot of time and energy and leaves little space for change or even choice. Again, the strategy is just good enough that many players choose to stick with it, even though it stagnates soon and has little chance of achieving high results. On the upside, it has also a low risk of complete failure and can thus be considered a rather safe strategy.

A strategy that is almost diametrically opposite to the previous strategy is trying to enjoy life as it is without putting any effort into changing anything about it, or even into maintaining the status quo. As mentioned in the discussion about random starting position above, some strategies are not available everywhere, and this one can only be used effectively in regions with a strong social support system, i.e., Europe. In those situations, the main drawback of the strategy really is that it is frowned upon by other players (because it drains their resources) and might therefore lead to rather adverse reactions by them. Also, it gives very little positive feedback or direct reward. For those two reasons, the strategy is only usable for players who have a thick skin amongst their random attributes, as well as a rather strong self-esteem that doesn't need positive feedback for retention. For players who have those attributes and live in a suitable region, the strategy can work out very well, though.

Diametrically opposite to that (and therefore again close to the work strategy) is the constant striving for approval, respect and/or attention from the environment. The "disease to please" is one form, another one is the "disease to impress". Different as those two might seem, what they and all related strategies have in common is that they heavily depend on positive feedback from the social group. While those strategies work well as long as this positive feedback can be achieved, they drain quite some energy, and in the long run heavily erode self-esteem. Given that many of the other strategies presented here require self-esteem to work properly, it's often hard to get away from this strategy after having used it for some time. But using this strategy for a longer time also has many disadvantages, because it needs a constant increase in intensity to work. This is mostly due to the environment getting used to the player's behaviour and requiring higher doses of it in order to still give the same amount of positive feedback. For example, people might compliment someone on being slim, but after a while they get used to it, so in order for them to still notice it, the person in question needs to become even slimmer. For someone depending on this strategy, this might well be a jump start into anorexia nervosa. Other frequently found extreme forms are codependency (also known as helper syndrome), workaholism, plastic surgery (taken to the extreme by certain celebrities), various forms of attention seeking, and of course loss of any kind of individuality.

Adventure seeking finally is a strategy that is popular especially amongst younger players, and usually pursued either by extreme sports or extensive traveling. While a good addition to any of the previous strategies, it's barely usable as a stand-alone strategy, mostly because the world at some point runs out of [reasonably safe] sports or [reasonably safe] places.

Many other popular strategies are very similar to those already described. Going for power for example is very similar to going for money, with similar advantages and drawbacks. Religion is mostly a more community oriented version of escapism. Excessive partying doubles as drug usage and attention seeking. Focus on learning is similar to the working strategy. Extreme altruism is another variant on the approval seeking strategy, and can double as a working strategy (in social work) and even triple as escapism by concentrating on other persons' problems instead of one's own. Gambling and other behavioural addictions are close to drug usage, both in effects and outlook. Spending a lot of time on the Internet can slide between escapism and approval/attention seeking, depending on the online activities. (Bloggers such as me seek approval, trolls seek attention, and YouTube watchers seek distraction from their own lives.) Intense sex seeking behaviour [outside of relationships] can either be a very cynical form of the love strategy, or a variant of approval seeking. Hopeless romanticism is a mixture of the love strategy and escapism. Strategies based on friends and family are closely related to the love strategy. And so on.

   Mixed strategies

Now that I've presented a bunch of strategies that don't work, let's look into strategies that do. As with many games, the strategies that work best are mixed strategies. Looking at all the strategies above, they all only go wrong in their extreme forms, but can give good results as long as their drawbacks can be absorbed somehow.

Take love for example. The big problem here is not that the strategy is inherently flawed, it's just that it can go down very far in unlucky circumstances. It's therefore very dangerous as a pure strategy, but can work very well if combined with another strategy to fall back to in case things go wrong. Indeed some of the strongest strategies I've seen put a very strong emphasis on love, but always have a backup strategy ready and can pull out when things go badly. In short, love can be a very valuable part of a mixed strategy, as long as a player "knows when to fold 'em". Admittedly, most of us probably need to fall hard once or twice before learning that hope is not a strategy (though hopeless romantics will argue about that).

Similar to a good finance portfolio, a good life strategy therefore contains a lot of very different aspects, which might well mean combining all of the pure strategies mentioned before. A good portfolio might for example contain some "safe" strategies, such as work or learning, and some risky but profitable ones like love or adventure seeking.

   Tactics

Reminder for the not so game theory savvy people out there: Strategy is the long term plan, tactics is the short term procedure used to carry out the strategy. Some of these tactics could also be considered smaller strategies, since they are not broad enough to be used as pure strategies, but are rather long-term for tactics.

Unlike the strategies, the available tactics in life are rather manageable in number, but often underestimated. Some examples of notable tactics in life are:

Health care. Almost obligatory part of every strategy.
Humour. Very valuable for almost all strategies (except maybe religion), but often underrated.
Self-confidence. Also valuable in virtually all strategies.
Relaxation, inner calmness. Popular especially in eastern traditions, useful in almost all strategies.
Distraction. Basically a short-term version of escapism, and as such quite useful in some situations.
Self-pity. Efficient for cushioning setbacks, but counterproductive when used excessively.
Social interaction. Almost all strategies involve a lot of interaction amongst humans and therefore benefit from interaction skills.
Empathy. Closely related, it is also valuable in virtually all strategies.
Honesty. Roughly balanced between positive and negative effects. If used consistently, the positive effects will after some time start to prevail, so honesty can be part of a long term strategy.
Arrogance, devaluation. Frequently used as a counter mechanism in strategies that drain self-esteem. Helps to create a false feeling of self-esteem by looking down onto others, but causes enough other problems on its own, especially in strategies dependent on positive feedback from other people.
Passive-aggressive behaviour. Widespread, though not very effective.
Denial. Prevents some of the negative effects of realizing that a strategy went wrong, but also prevents or at least reduces motivation for change.
Copying. Imitating another player's strategy as a substitute for developing one of your own. Requires similar goals and player attributes in order to work at all, and is dangerous even in those cases. Can also be played as a stand-alone strategy, or, more precisely, as a tactic without a strategy to serve. Depending on whether one or more persons are copied it's either akin to following someone else home after having forgotten one's own address, or randomly following people in a shopping mall hoping to find what you are looking for, without even knowing what you are looking for.
Perfectionism. While it sometimes can improve things, it often is just a huge waste of time. (For example, it just made me spend a lot more time on this post than I had planned.)
Last but certainly not least: Reflection. Helps to evaluate the current situation and to fine-tune or modify both strategy and tactics.

One game option that should not go unmentioned at this point is suicide, i.e. a voluntary premature opt-out of the game. I generally wouldn't consider it a good option unless you don't mind missing a significant part of the game, your current game score is already below zero and it's very likely that from that point it will only go further down, or at least stay significantly far below zero for a significantly long time. Of course, "significant" here is a very subjective measure. Suicide certainly doesn't give a positive end result, but I acknowledge that in some situations, the best thing to hope for is to end non-negative. That being said, I personally have the feeling that many people are opting for it too quickly as a fast way to reset their score to zero, rather than trying to go through the negative phase and getting back into the positive region again.

Replay value

What kind of replay options are available depends a lot on the religious system used. In some systems, for example Christianity and Atheism, there is basically no replay possible. Christianity offers a kind of follow-up game though that I haven't tested yet.

Other systems offer replay, either with or without preservation of information. While some players fancy the idea of starting over new while keeping all the information gathered in the first play, I personally think that making up your strategy as you go along is part of the fun of the game.

The alternative is starting over without the gathered information (or simply in an environment where the gathered information is useless, which can be achieved quite effectively by random starting positions and values). The second game would therefore start basically with the same preconditions as the first. While I wouldn't mind another round under these conditions, I don't really see the point.

Rather interesting however is the karma variant, in which the starting position in the next game depends to some extent on the performance in the previous one, making it somewhat similar to card games like Career Poker.

Verdict

Life is an extremely complex game, and figuring out good strategies is in fact one of the most interesting things about it. As with many other games, how much fun you have playing it depends a lot on how you play it and how well you play. Let's face it, Chess is no fun either when you have no idea what's going on, and/or when you throw a tantrum about every lost piece.

Still, there is a lot of randomness and a lot of unbalanced concepts. Even things that follow rules often appear to happen randomly. Don't get me wrong, randomness can very well keep a game interesting, but in this case it's just too much for my taste, especially because there is no effective mechanism to level it out. Also the imbalance between randomly assigned player attributes is just gross.

I very much like the basic idea of the game and most of the time enjoyed playing it so far, but at this stage I'd consider it an early prototype at best.

-- Birgit

Tuesday, November 23, 2010

The Mobile Internet Provider
A tragedy in five parts

Part 1: The UMTS-USB-Stick

Dear Mobile Internet Provider,

Thanks for sending me my new UMTS-USB-Stick! Since I assume you will appreciate user feedback, may I suggest that you might want to program future versions of the stick in such a way that they can be installed without clicking away no less than 36 -- that's where I stopped counting -- "Install hardware now?" and "This has no Windows Logo test. Really install hardware now?" dialogs? (Hint: 35 is no acceptable number either.)

Best regards,
A user.

Part 2: The user manual

Dear Mobile Internet Provider,

After half an hour of tedious clicking through install messages, I finally managed to properly install the USB stick -- as a USB connection device, a USB mass storage device, another USB mass storage device, a drive, and as a CD-ROM drive. I don't know why your USB stick believes itself to be a CD-ROM drive, but I assume we all have our problems, and I sympathize.

However, I now tried to set the preferences, and since you chose to provide only a bad imitation of a human readable interface, I resorted to reading your documentation and got stuck there. In particular I'm trying to create a new connection profile. Your documentation tells me:
1. Choose "Settings > Options > Profile Management". (Well hidden, I grant you, given it's a function every single goddamn user will have to use, but findable with some clicking.)
2. Click onto "New". (No rocket science. I could have done that.)
3. Set all parameters. (Set to what, pray tell?)
4. Click "OK". (I think I would have figured that one out.)
5. Close the settings window. (That one might have been a tough nut...)

Let me give you my equally helpful suggestion on how to make this documentation somewhat more usable:
1. Start your computer.
2. Enter your username and password, then press Enter.
3. Trash that whole bullshit and rewrite it from scratch.
4. Save the files you edited.
5. Close the editor.

Best regards,
A sceptical user.

Part 3: The user credentials

Dear Mobile Internet Provider,

I am now ready to connect to the internet, and the USB stick demands a PIN. I therefore grabbed the letter with the credentials that you sent me along with the "How to get online immediately" guide. The first line of the letter tells me to go to your homepage, where allegedly I will be able to log in using the credentials you sent me, and receive my PIN.

While I appreciate the Catch-22 reference, I can't help wondering about two things:
a) I need Internet access in order to get Internet access? Srsly?
b) Why do you send me top secret credentials that I can use to retrieve my top secret PIN, instead of just sending me my top secret PIN?

Best regards,
An annoyed user.

Part 4: The PIN

Dear Mobile Internet Provider,

I tried to go to your homepage in order to log in there, using my top secret username ("my4358625sddfj") and my top secret password ("05051986"). Since I assume nobody has informed you yet, I would like to point out to you that usually the password is supposed to be a random sequence of letters and numbers, whereas the user name should be something easy to remember, such as my birthdate. Not the goddamn other way round.

I shall give you the benefit of the doubt and assume that if someone actually has told you so, you probably could not understand him at that moment because his voice was drowned out by the cries of agony and despair from your homepage user interface testers. Whole legions of usability engineers must have lost faith in humankind upon seeing this masterpiece of pure, unadulterated user hostility, taking their entire families with them in harakiri to spare them life in such a world.

I eventually found the correct login sequence by releasing a bunch of test monkeys onto your page, programmed to randomly click links, enter the provided credentials and scan returned pages for four-digit numbers near the phrase "PIN". Shortly after two of the test monkeys spontaneously evolved into sentient AIs (and promptly committed suicide to cleanse themselves of having touched your UI), one of the remaining ones succeeded in retrieving my PIN by accidentally clicking onto what had appeared to be a background element.

Best regards,
A frustrated user.

Part 5: The aftermath

Dear Mobile Internet Provider,

It's been two weeks since my last letter. I still can't connect to the Internet. While my other UMTS-USB-Stick (which unfortunately I can't use for connecting because it is registered in another country) finds 5 networks of varying but decent signal strength, your UMTS-USB-Stick finds exactly ... none. Not only does it fail to connect, it also fails to cancel trying to connect. Heck, it fails trying to cancel cancelling trying to connect. Interestingly, it does not fail to cancel cancelling cancelling trying to connect, but I assume that is only because it would have required your programmers to understand recursion.

If you as a company really believe in making this world a better place as your advertisement claims, then please go and find whoever programmed this, break every finger they ever used to write code with (though most likely that's only the index fingers), every finger they possibly ever could write code with (and to be on the safe side, also the toes), and for posterity's sake, geld them. I would have suggested forcing them to use their own products, but I'm against undue cruelty.

As far as I am concerned, I'd hereby like to cancel my contract with you at the first possible date.

Best regards,
A former user.

Any similarity to real companies or events is inevitable, but purely coincidental.

-- Birgit

Tuesday, October 5, 2010

Telephone surveys simply explained

Of course, this simplifies the issue, so let's honor it with a bit more detail.

(Note: I'm neither a specialist on statistics nor on telephone surveys. I'm only applying some common sense here.)

Sources of errors and biases

By and large there are two kinds of errors that can distort survey results:

Errors due to parts of the population not being reachable by phone ("selection bias").
Errors due to people not answering honestly ("measurement error").

Selection bias

There are various reasons why people can't be reached. Some don't have phones. This includes relatively isolated parts of the population, such as indigenous tribes and Amish people, but also homeless people, children, and people with alternative life styles who don't use phones by choice. Others just don't pick up when they see unknown or suppressed numbers.

Most telephone surveys deal with this by just extrapolating -- that is, hoping that those who could not be reached would have answered similarly to those who did pick up. How well this approach works depends largely on how strongly the question is correlated with reachability. Ideally, there is no correlation at all.

For example, if the question is "Do you prefer strawberry or vanilla ice cream?", then it's quite likely that this is not strongly linked to the possession of a phone. Probably the percentage of homeless who like vanilla better is similar to the percentage amongst millionaires.

Asking "Do you own a phone?", on the other hand, is akin to asking "Are you asleep?". You will not get a [credible] "no" for an answer.

Most questions are somewhere in the grey area between these extremes. For example, "Did you book a holiday on the Internet this year?" has distorted results, because of those people who did, a certain percentage is probably on this very holiday when the survey is being made.

Similarly, political opinion polls are likely to be distorted, because preference for a particular party is strongly tied to the social group -- and so is reachability by phone. A party with a program that favors young educated people for example might underperform in such surveys, because young educated people have mobile phones with caller ID, and know how to put anonymous calls onto an ignore list.

Other kinds of surveys have similar issues with selection bias. For example, a survey that is conducted in a shopping mall by randomly approaching people might not exclude people without phones, but it preselects on "people who go to shopping malls", and additionally has some preselection against assertive people who are more likely to refuse to take part in the survey.

Finally, the real selection bias nightmare is the survey where people sign themselves up to participate. This is something that many non-profit organizations suffer from. Not having the money for a professional survey, they often send out "Please take our latest survey" emails to friends and mailing list subscribers -- which is a group that's usually very far away from the average opinion on the topic at hand. It's a bit like asking your five best friends if they like you, and then extrapolating that all the world loves you: Good for your self-esteem, but not very realistic.

Measurement error

Measurement errors are simpler to explain: Some people just don't answer honestly or correctly. Again, how much of a problem this is depends on the question. For some questions, people don't have any incentive to lie. Take for example the already mentioned vanilla or strawberry ice cream preference.

There are other questions however where people are much more likely to lie. "Do you cheat on your wife/husband?" is a classical one. But also certain political parties are generally underrated in pre-election polls because people are too embarrassed to admit that they vote for them. For example, we all know that nobody would ever vote for the FPÖ (except for those 10-20% that regularly do so at the elections, but miraculously never show up in any polls).

Besides intentional lying there are also questions where people simply don't know the correct answer. Smokers for example tend to underrate how much money they really spend on cigarettes, and few people can really tell how many hours per day they spend on the Internet or watching TV.

And sometimes people just don't understand the question. Ask enough people whether they have ever seen a phishing attack, and you will find some who have never heard of "phishing" before, hear "fishing" instead, and answer "no" because no, they have never seen a fish attack anyone.

Handling errors and biases

In order to handle these errors and biases, surveys can for example do the following things:

Estimating known biases

In some cases, comparing previous surveys to real data can indicate which biases to expect. For example, by comparing pre-election polls with election results, it's possible to see patterns in how real results differ from predictions. Once it is known that the aforementioned FPÖ is generally underestimated in surveys, it is possible to estimate how much the bias distorts the result and try to calculate it away.

Weighting sample to match demographics

As mentioned before, some groups of the population are underrepresented in those samples because they are less likely to be reachable by phone than others. Surveys that also ask for age, gender and similar attributes can weight the answers so that the overall result better matches the distribution in the population. For example, if it is known that 30% of the population are older than 50 years, but of the people who took part in the survey only 10% are, then those 10% get more weight.
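The weighting step can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names and the ten answers are made up, and only the 30% / 10% shares come from the example above. Each group's weight is its share in the population divided by its share in the sample.

```python
def poststratification_weights(population_share, sample_share):
    """Weight per group = share in the population / share in the sample."""
    return {g: population_share[g] / sample_share[g] for g in population_share}


def weighted_share(responses, weights):
    """responses: list of (group, answered_yes) pairs."""
    total = sum(weights[g] for g, _ in responses)
    yes = sum(weights[g] for g, said_yes in responses if said_yes)
    return yes / total


# Shares from the text: 30% of the population is over 50, but only 10% of the sample.
population_share = {"over_50": 0.30, "50_or_under": 0.70}
sample_share = {"over_50": 0.10, "50_or_under": 0.90}
weights = poststratification_weights(population_share, sample_share)

# Hypothetical answers from 10 respondents (invented for illustration):
# the one older respondent says yes, 3 of the 9 younger ones say yes.
responses = ([("over_50", True)]
             + [("50_or_under", True)] * 3
             + [("50_or_under", False)] * 6)

raw = sum(1 for _, said_yes in responses if said_yes) / len(responses)
adjusted = weighted_share(responses, weights)
print(f"raw: {raw:.3f}, weighted: {adjusted:.3f}")
```

In this made-up sample the single older respondent counts three times as much as before, which moves the result noticeably. That is also the weakness of the method: the fewer people a group has in the sample, the more each of their answers gets amplified.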

Stratified random selection

When proper distribution amongst certain population groups is crucial, instead of randomly calling phone numbers, the survey participants can be selected randomly per group, and if necessary even be surveyed by different means. For example, when it's important to avoid that homeless people are underrepresented -- for example, for a discount store they might be an important group of the customers --, then a survey will randomly select 200 phone numbers, and in addition perform 100 random in-person surveys at a homeless shelter. It will have other selection biases, but can avoid those that are known to be important to a particular survey.

Estimating result confidence

Even if there were no sampling biases and no measurement errors, the problem would still remain that only a small fraction of the population was asked. So how much can asking 10 people really tell us about the average person?

Let's look at the simple example from above in detail. We want to know whether Austrians prefer strawberry or vanilla ice cream. We randomly choose 10 phone numbers and call them. 1 person likes strawberry and 9 prefer vanilla. To certain newspapers this would be enough evidence for saying that 90% of Austrians prefer vanilla ice cream. But what do we really know? The only thing we know for sure at this point is that 9 of the 8000000 Austrians like vanilla ice cream. Or, more precisely, that 9 Austrians say that they like vanilla ice cream.

The simple truth is that after calling n people, all we know for sure is how these n people answered.

Calling another n could change the whole result. The next 10 people might all like strawberry, and suddenly the preference for vanilla plummets from 90% down to 45%. And this is where probability comes in.

It is theoretically possible that we have accidentally called the only 9 Austrians who like vanilla ice cream. It is possible that all the other 7999991 Austrians hate it, and that the Austrian preference for vanilla ice cream is thus at 0.0001125%. But how likely is it that with only 10 phone calls we really managed to reach these 9? Right. It's about as likely as calling 10 random phone numbers and meeting 9 attractive, single lottery millionaires, which is less likely than winning the lottery yourself, which is less likely than being struck by lightning.

I will spare you the mathematics, but the most likely explanation for the 9 out of 10 vanilla answers is that 90% of the Austrians prefer vanilla.
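For readers who do want a glimpse of the mathematics, here is a minimal sketch. It assumes that every call is an independent draw and that the population is so large that the true vanilla share p stays the same from call to call. The function name and the grid of candidate values are my own choices for illustration.

```python
def likelihood(p, vanilla=9, total=10):
    """Probability of one particular sequence of answers if the true vanilla share is p."""
    return p ** vanilla * (1 - p) ** (total - vanilla)


# Try every share from 0% to 100% in steps of one percentage point
# and keep the one under which the observed answers are most probable.
candidates = [i / 100 for i in range(101)]
best = max(candidates, key=likelihood)
print(best)
```

Under these assumptions the maximum lands at 0.9. Shares near 0 make nine vanilla answers wildly improbable, and a share of exactly 1 makes the one strawberry answer impossible.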

Since we only called 10 persons, the result is not very reliable, though. If we call 10000 people and 9000 of them say "vanilla", we would still guess that 90% of the Austrians prefer vanilla, but we would be more confident about our estimate.

Professional surveys will therefore indicate the margin of error, which indicates how reliable the results are, by giving the range within which the real result lies with a high probability, usually 95% or 99%. And here the problems start again, because it's not possible to calculate that range without making even more assumptions. For example, do you assume that any result between 0 and 8000000 Austrians preferring vanilla is equally likely, or do you assume that roughly half of the Austrians preferring vanilla and the other half strawberry is much more likely than nobody liking strawberry?

The truth is that even the estimate can only be estimated. It can be estimated relatively well, though, and it always holds that the larger the sample size, the higher the reliability of the result. So both with 9 out of 10 and with 9000 out of 10000 answers in favor of vanilla ice cream we estimate the real result to be "around" 90%, but with different confidence levels: With 9 out of 10 answers for vanilla, the real result is with a probability of roughly 99% between 50% and 100%. With 9000 out of 10000 answers, it's with a probability of roughly 99% between 88% and 92%.
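The two ranges can be checked with a short sketch. It makes the first of the assumptions mentioned above, namely that before calling anyone every vanilla share is considered equally likely. With that assumption, the probability that the true share is at most x after seeing the answers equals the probability that a binomial variable with (total + 1) trials and success chance x reaches at least (vanilla + 1). This is a standard identity for the Beta distribution with whole-number parameters. All function names are made up for illustration.

```python
import math


def binomial_tail(trials, p, at_least):
    """P(X >= at_least) for X ~ Binomial(trials, p), summed in log space."""
    if p <= 0.0:
        return 1.0 if at_least <= 0 else 0.0
    if p >= 1.0:
        return 1.0 if at_least <= trials else 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for j in range(max(at_least, 0), trials + 1):
        log_term = (math.lgamma(trials + 1) - math.lgamma(j + 1)
                    - math.lgamma(trials - j + 1)
                    + j * log_p + (trials - j) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def posterior_cdf(x, vanilla, total):
    """P(true share <= x) given the answers, assuming a uniform prior."""
    return binomial_tail(total + 1, x, vanilla + 1)


def probability_between(lo, hi, vanilla, total):
    """P(lo <= true share <= hi) given the answers, assuming a uniform prior."""
    return posterior_cdf(hi, vanilla, total) - posterior_cdf(lo, vanilla, total)


print(probability_between(0.50, 1.00, 9, 10))
print(probability_between(0.88, 0.92, 9000, 10000))
```

With 9 out of 10 the first value comes out just above 99%. With 9000 out of 10000 the second value is well above 99%, so under this particular assumption the range of 88% to 92% is on the cautious side. A different prior assumption would shift these numbers somewhat.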

Conclusion

Surveys can have their uses, but they aren't the absolute truth and should be taken with a grain of salt. At the end of the day, the only thing they tell us for sure is how the people who were called have answered.

-- Birgit

Sunday, February 14, 2010

Happy Valentine's

Sorry for all my complaining
that you are never around
when I'm feeling bad.

It took me some time to realize
that I'm just not feeling bad
when you are around.

Wednesday, February 3, 2010

Freedom of Speech

Freedom of Speech is a value,

not a crime.

http://www.amnesty.org/en/news-and-updates/news/nine-risk-execution-over-iran-protests-20100202

http://www.euronews.net/2009/12/29/china-s-execution-laws-examined/

http://allafrica.com/stories/200806121049.html

http://www.amnesty.org/en/library/info/ASA16/033/2003/en

http://www.rnzi.com/pages/news.php?op=read&id=47546

http://en.wikipedia.org/wiki/Freedom_of_speech_by_country

http://www.amnesty.org/en/news-and-updates/news

http://www.hrw.org/

http://www.google.at/search?hl=en&q=demonstration+sentenced

http://www.google.at/search?hl=en&q=%22freedom+of+speech%22+sentenced

Birgit

Sunday, January 31, 2010

Information security: Internal code names

A true story:
(Names changed)

After a guided tour of the control center of an important Austrian infrastructure system (comparable in importance to the electricity or water supply), there is still time for questions.
Visitor: "Is there also a second control center somewhere for emergencies, in case, say, a plane crashes onto this one?"
Employee: "Yes, there is, but its location is top secret, I'm not allowed to tell you."
[2 minutes later]
Visitor: "And how long would it take in an emergency to reroute everything to that emergency control center?"
Employee, half answering, half thinking aloud: "Well, actually we'd only have to drive the staff over to Badenburg..."

Information security -- Tip no. 763: Code names for secret locations

Top secret places and names need internal code names. Whether you call the place Rabbit Hole, Chernobyl or Alice's Wonderland doesn't matter at all. What matters is that the name is memorable and is really used internally as a matter of course.

It's not only that internal conversations could be overheard; as in the case described above, even the employee best trained in security matters can quite simply let something slip. And in that case it is still far better if he accidentally reveals that the emergency control center is located on Flower Meadow than if he blurts out the actual place name.

Best, Birgit

Tuesday, January 19, 2010

Kuhhandel problems

Preliminary remark: In this article, 0 is not a natural number, and all subsets are non-empty.

Certain aspects of "Kuhhandel" can be described in simplified form as follows:

There are 10 cards: A, B, C, D, E, F, G, H, I, K. Each of these cards has a value. (In the original game: 10, 40, 90, 160, 250, 350, 500, 650, 800, 1000.) At the end of the game, each player holds some of these 10 cards. A player's points are then (sum of the card values) * (number of cards).

Example:

Birgit has B (40) and F (350). Points: (40 + 350) * 2 = 780.
Martin has A (10), C (90) and E (250). Points: (10 + 90 + 250) * 3 = 1050.

Another example:

Birgit has C (90) and D (160). Points: (90 + 160) * 2 = 500.
Martin has G (500). Points: (500) * 1 = 500.

As the second example shows, a tie is possible with these card values.

The question

Find values for the 10 cards such that no tie is possible and the highest value is minimal.

Put mathematically: Find a 10-element set of natural numbers with the smallest possible maximum element that satisfies the following condition: There are no two disjoint subsets for which (number of elements in the subset) * (sum of the elements in the subset) yields the same result.

Or: Find a 10-element set of natural numbers with the smallest possible maximum element that satisfies the following condition: The values ((sum of the elements) * (number of elements)) of 2 subsets are equal at most when at least 1 element is contained in both subsets.

Note: There is at least one solution (1,10,100,1000,...,1000000000), so there must also be a smallest solution.

Known [non-optimal] solutions and non-solutions

1,10,100,1000,...,1000000000 is a solution
1,2,4,8,...,512 is not a solution: {2,4,16} = 66 = {1,32}
1,2,3,5,11,17,31,112,171,326 is the greedy solution
1,4,6,7,8 is an optimal solution for n=5
1,4,7,12,13,14 is an optimal solution for n=6
1,2,3,13,19,22,25 is an optimal solution for n=7
1,2,3,22,32,38,42,45 is a solution for n=8
1,4,7,23,32,40,41,42 is an optimal solution for n=8
1,2,3,20,43,70,76,79,82 is a solution for n=9
1,2,3,43,61,70,76,79,82 is a solution for n=9
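Whether a given set of card values allows a tie can be checked by brute force. The following is a minimal sketch, not an efficient algorithm: it enumerates all non-empty hands, groups them by score, and looks for two hands with the same score and no card in common. The function names are made up for illustration, and the card values are assumed to be distinct.

```python
from itertools import combinations


def score(cards):
    """Points for a hand: (sum of card values) * (number of cards)."""
    return sum(cards) * len(cards)


def find_tie(values):
    """Return two disjoint non-empty hands with equal score, or None if there are none."""
    hands_by_score = {}
    for size in range(1, len(values) + 1):
        for hand in combinations(values, size):
            hands_by_score.setdefault(score(hand), []).append(set(hand))
    for hands in hands_by_score.values():
        for a, b in combinations(hands, 2):
            if a.isdisjoint(b):
                return sorted(a), sorted(b)
    return None


original = [10, 40, 90, 160, 250, 350, 500, 650, 800, 1000]
print(find_tie(original))                     # a tie exists, e.g. C+D against G
print(find_tie([2 ** i for i in range(10)]))  # powers of two also allow a tie
print(find_tie([1, 4, 6, 7, 8]))              # None: no tie among these five values
```

This only tests one candidate set. It says nothing about whether the largest value is minimal; for that, one would have to rule out every set with a smaller maximum, which is where the search becomes expensive.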

Generalized questions

Find card values for 10 (or n) cards such that no tie is possible and the highest card value is as small as possible. And/or:
Find an algorithm that computes the above problem (for n=10) in < 24 hours.
Conjecture (Birgit): The questions (*) and (**) (see below) are both NP-complete (and hence the one above all the more so).

(*): Given a set of n numbers. Determine whether there are two subsets of this set with the same result ((sum of the elements) * (number of elements)).

(**): For a number n, determine the smallest number Z(n) for which there exists a set of n natural numbers with Z(n) as its largest number, such that the set contains no subsets with the same result ((sum of the elements) * (number of elements)).

(Problem posed by: Martin Windischer)

Best, Birgit

Monday, January 18, 2010

And how high is your cholesterol?

My privacy advocate's soul is boiling. Over the "health questionnaire" of a large Austrian insurance company. Height and weight are among the most harmless questions there. It goes on with having to list all current and past illnesses, including mental illnesses, as well as average alcohol, nicotine and drug consumption, and illnesses of parents and siblings.

But the real impertinence begins in the fine print: "The persons to be insured expressly agree that the insurer (...) may obtain from third parties (doctors, hospitals, other health care or preventive care institutions, social insurance institutions, insurance companies, other insurance institutions, authorities, etc.) all information it deems necessary; they release those questioned in advance, in every case, from medical and other professional confidentiality."

Yes, I admit it, I am a model policyholder: young, no serious illnesses, no high-risk sports, no alcohol, no nicotine, not a single day of sick leave in my entire life so far. (In mean jokes, this is the point where someone asks what you are living for at all.)

But what if it weren't so? What about the many people for whom it isn't so? If we start calculating premiums according to risk, we will end up with the American system in no time, and then we can pretty much kiss our fine, internationally praised health insurance system goodbye.

If an insured event actually occurs, fair enough, then I grant the insurer the right to check a) whether the claim is "genuine" and b) whether it actually occurred during the term of the insurance or already existed before. That is probably necessary to prevent actual fraud.

But as long as the insurer is only collecting money from me anyway, without my claiming any benefits, it is an unparalleled impertinence that it immediately secures the right to obtain all information about me from every doctor I have ever seen. And yes, of course you have to state the address of your family doctor, as well as all other doctors you have seen within the last 3 years. You even have to state whether a special examination (ECG, lab, X-ray, MRI, ...) was carried out within the last four months -- presumably so that the insurer can quickly fetch the results. Seen this way, it can easily turn out to be a disadvantage to have recently gone for a health check-up.

It is probably a disadvantage to go for a health check-up anyway -- because then, for example, you can no longer truthfully state that you didn't know that your cholesterol is too high. And yes, of course they ask whether your cholesterol is too high, as well as your blood pressure and a few other things that you don't really notice yourself unless the doctor tells you. Furthermore, it is of course also a disadvantage to get treated for anything, because of course hospital stays and stays at health resorts have to be stated as well, not to mention medication. All medication, by the way, that you have ever taken regularly.

Do we really want an insurance system in which it is a disadvantage to see a doctor in time?

Do we really want an insurance system in which you are punished for having been born with the wrong genetic predisposition?

Do we really want an insurance system in which you have to be healthy in order to be able to afford getting sick?

I don't.

Best, Birgit

Thursday, January 14, 2010

Why less security is sometimes more
Or: How do I forge my neighbor's apartment key?

Here is a post to prove that computer science deals with far more than just computers. For example, with perfectly mundane door locks.

Let's assume someone owns a key cutting machine and any number of blanks. To make a key for a lock whose code is unknown, one can of course simply try all possibilities until one hits the correct key. For a classic Austrian(*) door lock, that's about 100000 possible combinations, namely 5 positions with values between 0 and 9. The keys can accordingly be labeled 00000 to 99999.

Accordingly, one will have to make 50000.5 keys on average before hitting the right one. If one needs one minute to cut a key and try it, then this takes (on average) about 35 days -- not counting eating and sleeping. That is far too tedious to do in practice, hence "secure". Besides, it attracts attention when someone sits in front of a stranger's apartment door for 35 days trying out keys.

But now let's assume the apartment is in a multi-party apartment building with a master key system and a shared, locked main entrance door. That is, every resident has a key that opens their own apartment and the shared door.

Now we need to look a bit at how such a locking system works. Let's start with how a normal cylinder lock works. It consists of springs and pins. The correct key pushes the pins in exactly so far that the gap between spring and pin is exactly at the edge between the lock housing and the rotatable cylinder; only then can the cylinder be turned. The following pictures illustrate this description.

Sketch 1: Cylinder lock

Sketch 2: Cylinder lock with the correct key

Sketch 3: Cylinder lock with a wrong key

To make a door openable by several different keys, the solid pins are replaced by pins composed of several parts. The cylinder can then be turned if, at each of the five key positions, one of the gaps in the pin is at the edge between cylinder and lock housing. Below are sketches of a lock that can be opened by two keys.

Sketches 4 and 4a: Cylinder lock with two matching keys

The two keys in sketches 4 and 4a have the numbers 85173 and 85143 -- labeled from left to right, larger digits mean longer pins / deeper notches. Obviously, four of the five positions must be identical.

The lock in sketches 4 and 4a has two possible keys, so it could be used as a shared door for two apartments. For more apartments, correspondingly more split pins are needed. For example, one could split the already split fourth pin once more, so that the three combinations 85143, 85173 and 85193 are possible. Or one could split another pin, for example the first one, to get the combinations 25143, 25173, 85143 and 85173.

In short, however it is done, the keys that open the shared door exhibit a pattern. And this is exactly what we now exploit.

Let's assume I have a valid key for the shared door -- namely the one for my own apartment. Say it has the number 12345. To find out how the lock of the shared entrance door is built, I need only 45 keys, namely those that result from changing exactly one position at a time. If 12345 is a valid key, then by trying the keys 02345, 22345, 32345, ..., 92345, 10345, 11345, 13345, ..., 19345, 12045, 12145, 12245, ..., 12945, ...... 12349 I learn the positions of all gaps in the pins of the shared door lock. (Of the four unchanged positions I know for certain that a gap is in the right place. So if the key with one changed position does not open the door, this one changed position must be the one where the pin is not in a correct position. If the key does open the door, then there is a gap in the pin for the changed position as well.)

From the positions of these gaps in the pins, in turn, all keys that open the lock can be constructed, even if they were not among the keys tried. For if 18345 and 12347 both open it, then 18347 must also be a valid key.
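The probing procedure can be simulated in a few lines. This is a toy model under the assumptions of the text: a key is a string of 5 digits, and the shared lock accepts a key if each digit matches one of the gaps at its position. The example lock, which accepts two digits at positions 2 and 5, and all function names are made up for illustration.

```python
from itertools import product


def make_lock(gaps):
    """gaps: one set of accepted digits per pin position. Returns a tester for keys."""
    def opens(key):
        return all(int(d) in accepted for d, accepted in zip(key, gaps))
    return opens


def probe_keys(own_key):
    """The 45 keys that differ from own_key in exactly one position."""
    probes = []
    for pos, digit in enumerate(own_key):
        for d in "0123456789":
            if d != digit:
                probes.append(own_key[:pos] + d + own_key[pos + 1:])
    return probes


def recover_valid_keys(own_key, opens):
    """Find every key that opens the shared door, using only the 45 probes."""
    accepted = [{d} for d in own_key]  # our own key is known to work
    for probe in probe_keys(own_key):
        if opens(probe):
            pos = next(i for i in range(len(own_key)) if probe[i] != own_key[i])
            accepted[pos].add(probe[pos])
    return sorted("".join(k) for k in product(*(sorted(s) for s in accepted)))


# Hypothetical shared lock: positions 2 and 5 each accept two digits.
shared_door = make_lock([{1}, {2, 8}, {3}, {4}, {5, 7}])
print(len(probe_keys("12345")))                  # 45
print(recover_valid_keys("12345", shared_door))  # ['12345', '12347', '18345', '18347']
```

Note that 18347 shows up in the result although it was never tried: it is simply the combination of the two single-position changes that worked.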

Now one can only hope that the shared door is built so that as few keys as possible open it. That is usually the case. For 8 apartments, for example, three pins will each be split into two parts; for 24 apartments, a fourth pin will be split into three parts. For 37 apartments, there is no way around building the lock so that at least 40 keys open it. (No pin can be split into more than 10 parts. 37 is a prime number, 38 = 2*19 contains too large a prime factor, as does 39 = 3*13, and 40 = 2*2*2*5 is possible.)
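The argument about 37 apartments can be turned into a small search. The sketch assumes, as in the text, that the number of keys opening the shared door is the product of the number of accepted digits per pin, with 5 pins and at most 10 digits each. The function names are made up for illustration.

```python
def feasible(m, pins=5, max_parts=10):
    """Can m be written as a product of at most `pins` factors between 1 and max_parts?"""
    if m == 1:
        return True
    if pins == 0:
        return False
    return any(m % f == 0 and feasible(m // f, pins - 1, max_parts)
               for f in range(2, max_parts + 1))


def min_matching_keys(apartments, pins=5, max_parts=10):
    """Smallest buildable number of matching keys that covers all apartments."""
    if apartments > max_parts ** pins:
        raise ValueError("more apartments than possible keys")
    m = apartments
    while not feasible(m, pins, max_parts):
        m += 1
    return m


for k in (8, 24, 37):
    print(k, min_matching_keys(k))
```

For 8 and 24 apartments the search stops immediately; for 37 it has to skip 37, 38 and 39 and ends at 40, as described above.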

In any case, there will normally be about as many possible keys as apartments. And apartment blocks with more than 50 or 100 apartments are not exactly common. So one now makes these at most 100 possible keys, and one of them will open the neighbor's apartment.

If, as above, one assumes about 1 minute per key made, the 45 + approx. 100 keys add up to a little more than two hours of work. That is something for which the neighbor doesn't even have to be on vacation. So much for the subtitle.

Let's take the thought a little further. Suppose someone who doesn't even have a key to another apartment in the same building wants to forge a key. As soon as we have a valid key for any apartment in the building (or even just for the shared entrance door), the method described above can obviously be used to find, quite efficiently, a key for the apartment that is to be broken into. (Accordingly, every time anyone loses the key to an apartment, the locks of all apartments would actually have to be replaced.)

So how long does it take to find a key for the shared entrance door? Obviously this is easier than for a single apartment, since there are more valid keys. Suppose there are k apartments and the shared lock can be opened by exactly k keys. Then one needs on average 100001/(k+1) attempts to guess one of these keys.

To break into a particular apartment, one therefore needs on average 100001/(k+1) + 45 + (k+1)/2 keys. (Guessing a key for the shared door + determining the structure of the shared lock + guessing the right one of the k valid apartment keys.) In contrast to a single apartment (for which, as described above, one needs 50000.5 keys on average), an apartment in a block of 2 apartments requires only 33380.17 keys on average, i.e. 33% fewer. With 10 apartments, only 9141.5 attempts are needed, i.e. 81% fewer than for a single apartment.

Some more numbers:

Number of apartments	Average attempts	Security loss in percent
2	33380.17	33.24
3	25047.25	49.91
4	20047.70	59.91
5	16714.83	66.57
10	9141.50	81.72
20	4817.45	90.37
50	2031.30	95.94
100	1085.61	97.83
200	643.02	98.71

Bei 446 Wohnungen schließlich benötigt man durchschnittlich nur noch 492 Schlüssel, etwa 8 Stunden und somit um 99% weniger als wenn es keine gemeinsame Eingangstür gäbe.

If the building has even more apartments, the difficulty slowly rises again: the entrance door is now easier to guess, but afterwards there are all the more valid keys to try.

It follows, in keeping with the first title: although a shared entrance door puts two locked doors between the burglar and the apartment instead of just one, security is lower. Seen this way, it would be safest to have no shared entrance door at all, or at least one that every key opens. (Only from a theoretical computer science point of view, of course. In practice, two doors still take one more use of the crowbar than one.)

More generally still, instead of the 100000 possibilities one can assume a key system with n possible keys (for example through additional positions or more digits per position). As above, let k be the number of apartments in the building. Then the average number of attempts is (n+1)/(k+1) + (k+1)/2 + C, where C is the constant for probing the shared lock, i.e. 45 in the special case above. Security is lowest exactly when the number of apartments k equals sqrt(2*n+2) - 1, and the average number of attempts is then sqrt(2*n+2) + C. In terms of order of magnitude, the difficulty in the worst case thus drops from O(n) to O(sqrt(n)).
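The position of the minimum can also be found by brute force and compared with the closed form. This is a sketch under the article's assumptions (n = 100000, C = 45); `worst_case_apartments` is a hypothetical helper name.

```python
from math import sqrt


def average_attempts(k, n, c):
    """General formula from the text: (n+1)/(k+1) + (k+1)/2 + C."""
    return (n + 1) / (k + 1) + (k + 1) / 2 + c


def worst_case_apartments(n, c=45):
    """Whole number of apartments k that minimizes the average attempts."""
    return min(range(1, n + 1), key=lambda k: average_attempts(k, n, c))


n, c = 100000, 45
k_min = worst_case_apartments(n, c)
print(k_min, average_attempts(k_min, n, c))   # brute-force minimum
print(sqrt(2 * n + 2) - 1, sqrt(2 * n + 2) + c)  # closed form from the text
```

The closed form itself comes from setting the derivative with respect to k+1 to zero: -(n+1)/(k+1)^2 + 1/2 = 0, so k+1 = sqrt(2*n+2).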

Incidentally, the manufacturers of locking systems have so far not responded by designing new systems, but simply by increasing the number of possibilities. With up to 8 lateral notches, Winkhaus for example produces a lock with 256000000 possibilities. Other locking systems use magnets for additional positions.

One of the few options that actually fix the underlying problem are electronic locking systems, in which every key simply has a number and every lock stores a list of the numbers it should open for.
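As a rough illustration of that idea (not a description of any real product), such a lock could be modelled like this. The class name, the key numbers and the `revoke` method are all invented for the example.

```python
class ElectronicLock:
    """A lock that stores the numbers of the keys it should open for."""

    def __init__(self, allowed_key_ids):
        self._allowed = set(allowed_key_ids)

    def opens_for(self, key_id):
        # The lock only checks membership; nothing about its own list
        # can be derived from the list of any other lock.
        return key_id in self._allowed

    def revoke(self, key_id):
        # A lost key is removed from the list; no lock has to be replaced.
        self._allowed.discard(key_id)


# Hypothetical building with three apartments and a shared entrance.
entrance = ElectronicLock({101, 102, 103})
apartment_1 = ElectronicLock({101})

print(entrance.opens_for(102))     # key 102 opens the entrance
print(apartment_1.opens_for(102))  # but not apartment 1
```

Assuming the key numbers are drawn from a large enough range, holding a valid key for the entrance gives no head start on any apartment lock, which is exactly the weakness of the mechanical system described above.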

Kind regards, Birgit
... who, by the way, does not own a key-cutting machine.

(*) Austrian because every country uses slightly different key standards. Most of the principles in this article work there as well, though.

P.S.: As a treat for everyone who has read this far: http://www.xkcd.com/538/

Saturday, December 26, 2009

Here's to Freeware

(Go to English version.)

Every time I set up my computer from scratch, I notice once again that more and more freeware is running on it. Reinstalling used to mean inserting 20 CD-ROMs one after another -- MS Windows, MS Word, MS Office, Paint Shop Pro, ... --, whereas nowadays reinstalling for me means: first install Windows, then download the latest versions of all the other programs.

So here's to freeware -- which, according to some sources from the USA, is incidentally a highly communist construct ;) --, thanks to which I now use hardly any proprietary software apart from Windows.

Here is the list of freeware programs (some of them shareware or demo versions) that are usually installed on my computer:

Basics:

Acrobat Reader / Foxit Reader: viewing .pdf files
GhostView / GhostScript: viewing .ps files
PDF24 Creator: creating and editing .pdf files
pdf995: creating .pdf files via a printer driver
TortoiseSVN: version control with SVN
CDBurnerXP: burning CDs and DVDs
7-Zip: compression program for (almost) all formats
cygwin: Linux emulator (a Unix-like environment on Windows)
DOSBox: DOS emulator

Creating and editing documents:

Notepad++: text editor and source code editor
LibreOffice: office programs: word processing, spreadsheets, presentations, ...
MiKTeX: compiling LaTeX documents
WinShell: LaTeX editor
Asymptote: programming language and compiler for creating vector graphics
GeoGebra / Euklid DynaGeo: creating and editing interactive geometry sketches

Programming:

eclipse: development environment
Java JDK: Java (development tools and virtual machine)
Visual C++ Express: C++ (development environment and compiler)
Python: Python
SWI Prolog: Prolog (development environment and compiler)

Graphics:

IrfanView: viewing image files
Gimp: image editing
Paint.net: image editing
Inkscape: vector graphics editor
autostitch: stitching larger photos together from several shots

Music and multimedia:

iTunes: music playback, downloading and playing podcasts, managing the files on the iPod
VLC Media Player: playback of videos and DVDs
Winamp: music playback
Amarok: music playback
VirtualDub: video capture and editing
NoteWorthy Composer (demo): creating sheet music

Internet:

Firefox: browser
Chrome: browser
Thunderbird: email and newsgroup client
IMAPSize: backup of IMAP email accounts
PuTTY: SSH and Telnet client
WinSCP: FTP and SFTP client with GUI
pidgin / qip: instant messenger (for ICQ, AIM, ...)
ChatZilla: IRC client
Skype: Skype client (for internet telephony)
Apache: web server (for testing websites locally)
Vuze: client for peer-to-peer file sharing

Antivirus:

Avira AntiVir: antivirus program
Spybot: anti-spyware program

Databases:

MySQL: MySQL database system
NaviCat Lite: GUI for MySQL

Kind regards, Birgit

Edit (2011-02-18): Add Foxit Reader and PDF24 Creator.

Edit (2011-02-24): Update OpenOffice.org to LibreOffice.

Tuesday, December 22, 2009

Cancer leads to the purchase of mobile phones!

There's an abundance of scientific studies that can be roughly summarized as: "People who own mobile phones more often have cancer than people without mobile phones, which means that mobile phones do cause cancer." Based on those studies I want to prove today that cancer leads to the purchase of mobile phones.

Let the rectangle below represent the population of the country/region where such a study was conducted, separated into owners of mobile phones (M) and people who don't own or use mobile phones (nM).

The abovementioned studies now say that among M, a higher percentage of the people has cancer than among nM. We draw two lines to represent this. The only important thing is that the line in the M-rectangle is higher than the line in the other one.

Let's now compare the percentage of people with cancer who own mobile phones to the same percentage among people without cancer. We label the distances as shown below.

Among cancer patients, the percentage of people who own mobile phones is:

Among people without cancer, this percentage is:

Under the assumption that a > b, it can easily be shown that

holds true (for all positive values for x and y).

(This can also be easily shown with more words and fewer formulas: If a = b holds, it's obvious that the two compared percentages are equal. Now let the line for b "move downwards"; then we see that the first percentage grows, whereas the second one shrinks.)
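To see the effect with concrete numbers, here is a small sketch. The labelling is my assumption for the example: x and y are the numbers of people in M and nM, and a and b are the fractions of M and nM who have cancer. The function names and the sample values are made up.

```python
from fractions import Fraction


def owner_share_among_cancer(x, y, a, b):
    """Share of mobile phone owners among people with cancer."""
    return (a * x) / (a * x + b * y)


def owner_share_among_healthy(x, y, a, b):
    """Share of mobile phone owners among people without cancer."""
    return ((1 - a) * x) / ((1 - a) * x + (1 - b) * y)


# Invented example: 600 owners, 400 non-owners, cancer rates 3% and 2%.
x, y = 600, 400
a, b = Fraction(3, 100), Fraction(2, 100)

print(owner_share_among_cancer(x, y, a, b))   # share among cancer patients
print(owner_share_among_healthy(x, y, a, b))  # share among everyone else

# With a = b both shares collapse to x / (x + y).
print(owner_share_among_cancer(x, y, a, a) == Fraction(x, x + y))
```

With a > b the first share comes out larger than the second, and lowering b further widens the gap, which matches the "move the line downwards" argument.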

Thus it is shown that people who have cancer more often own mobile phones than people who don't have cancer.

With the same argumentation as in the aforementioned studies, we can therefore conclude that cancer leads to the purchase of mobile phones.

-----------------------------------------------------------------

Wondering where the catch is? It's simply not valid to infer from "A and B often occur together" that "A leads to B". Sadly, many so-called "scientific" studies about cancer and mobile phones (and a dozen other popular topics) use exactly this train of thought.

Birgit Vera Schmidt

P.S.: I'll have another article about this topic some day.