code logs -> 2008 -> Wed, 23 Jul 2008< code.20080722.log - code.20080724.log >
--- Log opened Wed Jul 23 00:00:37 2008
01:15 Rico is now known as EvilDarkLord
02:52 Thaqui [~Thaqui@Nightstar-1405.jetstream.xtra.co.nz] has joined #code
02:52 mode/#code [+o Thaqui] by ChanServ
03:07 Attilla [~The.Attil@92.9.153.ns-3442] has quit [Quit: <Insert Humorous and/or serious exit message here>]
05:16 Vornicus is now known as Vornicus-Latens
06:01 Doctor_Nick [~nick@Nightstar-23917.hsd1.fl.comcast.net] has quit [Ping Timeout]
06:27 Doctor_Nick [~nick@Nightstar-23917.hsd1.fl.comcast.net] has joined #code
07:02 AnnoDomini [AnnoDomini@Nightstar-29805.neoplus.adsl.tpnet.pl] has joined #Code
07:02 mode/#code [+o AnnoDomini] by ChanServ
08:44 You're now known as TheWatcher
11:23 AnnoDomini [AnnoDomini@Nightstar-29805.neoplus.adsl.tpnet.pl] has quit [Ping Timeout]
11:30 AnnoDomini [AnnoDomini@Nightstar-29083.neoplus.adsl.tpnet.pl] has joined #Code
11:30 mode/#code [+o AnnoDomini] by ChanServ
11:37 Thaqui [~Thaqui@Nightstar-1405.jetstream.xtra.co.nz] has left #code [Leaving]
12:34 Attilla [~The.Attil@92.9.153.ns-3442] has joined #code
12:34 mode/#code [+o Attilla] by ChanServ
13:07 EvilDarkLord [~jjlehto3@Nightstar-9591.cs.hut.fi] has quit [Client exited]
17:00 EvilDarkLord [~jjlehto3@Nightstar-9591.cs.hut.fi] has joined #code
17:00 mode/#code [+o EvilDarkLord] by ChanServ
18:11 AnnoDomini [AnnoDomini@Nightstar-29083.neoplus.adsl.tpnet.pl] has quit [Ping Timeout]
18:21 AnnoDomini [AnnoDomini@Nightstar-29083.neoplus.adsl.tpnet.pl] has joined #Code
18:21 mode/#code [+o AnnoDomini] by ChanServ
18:25 Syloqs-AFH [Syloq@Admin.Nightstar.Net] has quit [Ping Timeout]
18:26 Vornicus-Latens is now known as Vornicus
19:52
<@McMartin>
A new I7 build appears to be in the works.
19:54
<@ToxicFrog>
Cool.
21:55
<@ToxicFrog>
GCC for embedding asm is _asm { code goes here }, yes?
21:57
<@McMartin>
I believe so yes
21:57
<@McMartin>
I also believe you have to use the goddamn alien AT&T syntax.
21:57
<@ToxicFrog>
All I'm doing is int 3, so.
21:58
<@ToxicFrog>
...also, it's picky about whitespace.
21:58
<@ToxicFrog>
_asm {
21:58
<@ToxicFrog>
int 3;
21:58
<@ToxicFrog>
}
21:58
<@ToxicFrog>
Is legal.
21:58
<@ToxicFrog>
_asm { int 3; }
21:58
<@ToxicFrog>
Is not.
21:58
< Vornicus>
...what the cheese?
21:59
<@McMartin>
Assembler does tend to be "one statement per line"
21:59
<@McMartin>
; may be comments~
21:59
<@ToxicFrog>
Aah yes.
21:59
<@ToxicFrog>
Didn't think of that.
21:59
<@ToxicFrog>
Vornicus: in SGOS, at least, int 3 is "invoke the debugger"
21:59
< Vornicus>
aha
22:00
<@ToxicFrog>
And I am currently working to track down extra happy fun time kernel panic on free() bug.
22:00 * Vornicus does battle with naming and semantics.
22:00
<@McMartin>
That reminds me. Did the DNS bug hit you guys?
22:00
<@McMartin>
Or is that information currently classified
22:02
<@ToxicFrog>
I have no idea if it's classifed or not, and I don't actually know the answer (having only passing familiarity with our DNS subsystems), so I won't comment on it.
22:03
<@McMartin>
Fair enough
22:05
<@ToxicFrog>
Aha. It's actually on the website, so I guess it's not classified.
22:05 You're now known as TheWatcher[T-2]
22:06
<@ToxicFrog>
And the answer is "most of our products are, although some of those are only affected to the degree of DoS; patches are available".
22:07 * Vornicus fiddlefiddles.
22:07 You're now known as TheWatcher[zZzZ]
22:08 * ToxicFrog works on sanifying his vstruct unit test framework
22:09
< Vornicus>
Okay. In Lua what I'd do is make this class generate other classes.
22:09
< Vornicus>
Also in C++.
22:10
<@McMartin>
Er, like dynamic compilation or like factory methods?
22:11
<@ToxicFrog>
Factory methods generate objects, I thought?
22:11
<@ToxicFrog>
I didn't think you could generate classes at runtime in C++
22:11
<@McMartin>
Depending on your language, classes are objects too.
22:11
<@ToxicFrog>
(well, not portably)
22:11
<@ToxicFrog>
Yeah.
22:11
<@McMartin>
And I read it as "generate objects of other classes"
22:11
<@ToxicFrog>
But I'm fairly sure that's not the case in C++.
22:11
<@ToxicFrog>
Aah.
22:11
<@ToxicFrog>
I read it as "generate new classes, which can then be instantiated if you like"
22:12
< Vornicus>
What TF said.
22:12
<@McMartin>
... yeah, so, uh, how is that possible in C++
22:12
< Vornicus>
Fake it with templating.
22:13
< Vornicus>
What I have is a sorted list class that takes a custom comparator/key thing
22:13
<@McMartin>
So this isn't at runtime then
22:13
< Vornicus>
It would be in Python.
22:14
<@McMartin>
Pass in the comparator as a callable argument to the constructor, imo
22:14
< Vornicus>
McM: I think we're getting too gummed up in the specifics of C++ here though.
22:14
< Vornicus>
RIght, that's what I've got.
22:14
<@McMartin>
So what's the problem~
22:15
< Vornicus>
But... I also want to be able to say "okay here's a sorted list, I want another one that works the same way"
22:15
<@McMartin>
Make the comparator an accessible field, or have a clone() method
22:16
<@McMartin>
The Java approach would be new TreeSet(otherSortedSet.getComparator()), I believe.
22:18
< Vornicus>
Right now I havea "behavior" argument which takes another sorted list.
22:40
<@MyCatVerbs>
Y'know, the more I read my system's man pages, the less need I see for any language other than C.
22:41
<@MyCatVerbs>
setcontext(2) and getcontext(2) - give you first-class continuations. If you poke around dlsym and friends (especially with the GNU extensions) there's actual reflection in there.
22:42
<@MyCatVerbs>
Hell, nothing stopping you from invoking a C compiler at runtime, building yourself a .so and then loading it.
22:42
<@ToxicFrog>
Just because it's possible doesn't mean it's easy.
22:42
<@ToxicFrog>
Or painless.
22:43
<@ToxicFrog>
Or in any way portable.
22:44
<@ToxicFrog>
In particular, everything in section 2 is a system call, which means it's almost certainly not portable to non-*nix OSes, and possibly not portable to other *nixes or even other major kernel versions of the same OS.
22:45
<@ToxicFrog>
And it's a lot harder to use - both "at all" and "effectively" - than, say, call/cc.
22:45
<@MyCatVerbs>
Oh yeah.
22:46
<@MyCatVerbs>
By the time I hit the end of the dlsym manpage I was giggling at just how improbable it would be for such a program to ever *not* segfault.
22:46
<@McMartin>
To say nothing of "without providing black hats 37,523 new ways to run arbitrary code on any system with your program installed"
22:46
<@ToxicFrog>
That too.
22:46
<@MyCatVerbs>
That's okay. We'll add a preprocessor that provides a Turing-complete type checker.
22:47
<@MyCatVerbs>
Then any programs that you wish to see make it through have to include annotations specifying a proof of correctness. :)
22:50
<@McMartin>
You mentioned .sos.
22:50
<@McMartin>
It is theoretically impossible to verify programs that dynamically link in components not guaranteed to be present or unchanged at run time.
22:50
<@McMartin>
And even if you verify it, if the attacker can hack the load path, pwnt
22:50
<@McMartin>
Especially since in UNIX hacking the load path involves "setting environment variables".
22:53
<@MyCatVerbs>
I'm curious as to how common it is that an attacker can set environment variables, but can't alter the executables I'll be running in any other way.
22:54
<@ToxicFrog>
Typically excutables are installed by the sysadmin in /usr/local/bin, which is root-only
22:54
<@MyCatVerbs>
I mean, there're obvious things like altering PATH so that I won't even get the right executable in the first place...
22:54
<@ToxicFrog>
On the other hand, setting environment variables is just a matter of going env LD_LIBRARY_PATH=~/evil/my_hacks/ foo...
22:54
<@McMartin>
Indeed
22:54
<@McMartin>
All you need to be able to modify environment variables is the ability to *execute* the target program.
22:54
<@MyCatVerbs>
Which, fine, I just need to use a fully-qualified pathname when I invoke the executable.
22:55
<@McMartin>
they don't have to
22:55
<@McMartin>
And even then, LD_LIBRARY_PATH is what libdl checks first.
22:55
<@MyCatVerbs>
McMartin: you seem to be assuming that my executable is suid or sgid to soemone with priviledges.
22:55
<@MyCatVerbs>
McMartin: and not if I give a fully-qualified pathname, AFAIK.
22:56
<@McMartin>
Only for PATH attacks.
22:56
<@McMartin>
LD_LIBRARY_PATH has nothing to do with PATH
22:56
<@MyCatVerbs>
McMartin: yuhuh. Just tested it. ;P
22:57
<@ToxicFrog>
...I think we're losing sight of the fact that this is a terrible idea by getting bogged down in discussion of actual exploits.
22:57
<@MyCatVerbs>
McMartin: specifically, if I go dlopen("/specific/directory/foo.so",...) instead of dlopen("foo",...), I'm fine.
22:57
<@McMartin>
Oh.
22:57
<@McMartin>
Yes.
22:57
<@MyCatVerbs>
Besides which, if someone can set environment variables, aren't I fucked anyway? LD_PRELOAD, no
22:58
<@McMartin>
However, now your program is not installable and every sysadmin in the world hates you
22:58
<@McMartin>
Generally speaking, dynamic libraries are Only For System Libraries, or Only For Sandboxes.
22:58
<@MyCatVerbs>
Mmmm.
22:58
<@McMartin>
And the answer is "that depends on whether or not your system has other privilege escalation vulnerabilities."
22:59
<@MyCatVerbs>
McMartin: I mkdtemp(), chdir into it, write the .so files out in there, and then load "./foo.so"
22:59
<@McMartin>
Sysadmins like to own their file hierarchy.
22:59
<@McMartin>
If you hardcode locations, they hate you and hack it so it's not
22:59
<@McMartin>
If you don't, there's a way of invoking it to run arbitrary code as whoever owns that process.
23:00
<@McMartin>
Whether you consider "as whoever owns that process" a vulnerability or not is up to you, but it's typically treated as such.
23:00
<@MyCatVerbs>
But the directory mkdtemp() gives me won't be writable-into by anyone other than the user that my program is running as.
23:00
<@McMartin>
This doesn't change the fact that dynamically loaded code should always either be sandboxed or system-level.
23:01
<@McMartin>
MCV: That only means that invocation won't.
23:01
<@McMartin>
The whole point is that your code says "now load this file and execute the code in it" at various points in its run.
23:01
<@McMartin>
This is Bad Times.
23:01
<@MyCatVerbs>
What do you mean by that? It's a directory in /tmp with 0700 permissions. It has .so files being written into it, and loaded.
23:02
<@McMartin>
You're saying "This is OK because nobody could possibly make it a different file"
23:02
<@McMartin>
And I say "AHAHAHAHAHAHAHAAHAHAHAHAAAAAAA".
23:02
<@MyCatVerbs>
I'm saying no one can write to that directory who isn't either a) the same user that my program is running as anyway or b) root.
23:02
<@McMartin>
I'm saying that won't protect you.
23:02
<@MyCatVerbs>
In both cases, I'm already fucked.
23:03
<@MyCatVerbs>
Yes it will.
23:03
<@McMartin>
No it won't. Some external interactor could overflow a buffer or write through a wild pointer and rewrite the filename to load.
23:03
<@McMartin>
Or any of the dozens and dozens of other data corruption bugs to which C is liable.
23:03
<@MyCatVerbs>
In which case I'm already fucked.
23:03
<@McMartin>
You're relying on your jail working.
23:04
<@McMartin>
Not every overflow or wild pointer is actually exploitable.
23:04
<@McMartin>
You're making any such much easier to exploit.
23:04
<@MyCatVerbs>
Because now you've assumed that my program has stack-smashing vulnerabilities, so the attacker might as well skip bothering to play silly buggers with the ELF file and just run his shellcode directly.
23:04
<@McMartin>
And much harder to defend against.
23:04 * MyCatVerbs thinks for a second.
23:04
<@McMartin>
Stack guards exist.
23:05
<@McMartin>
And no, I was actually assuming heap-smashing possibilities, which are usually much harder to exploit.
23:05
<@MyCatVerbs>
Right. So _now_ is the point where the theorem prover comes in.
23:05
<@McMartin>
At least in C.
23:05
<@McMartin>
Or you could just run the code interpreted in a sandbox.
23:05
<@MyCatVerbs>
Heh. Except that I've just turned pretty much *all* data corruption bugs into arbitrary code execution.
23:05
<@McMartin>
This has the advantage that that's a solved problem, while automatically doing theorem proving has remained an unsolved problem for 50 years.
23:06
<@MyCatVerbs>
Er, it's not unsolved? At least to the extent that it's actually solvable, anyway.
23:06
<@McMartin>
It is still a very active research area.
23:06
<@McMartin>
Unless you mean things like "type checking as you load it"
23:06
<@MyCatVerbs>
You can't take any arbitrary unannotated code and prove it correct or not, sure.
23:07
<@McMartin>
Proving correctness was abandoned en masse in favor of conservative approximations that were usually right, and then those were abandoned because they never accepted any program as ever correct.
23:07
<@MyCatVerbs>
Well, you actually can, but what you can't do is write a system that doesn't occasionally return "fuck off" for actually perfectly acceptable programs (doing so requires you to solve the Halting Problem).
23:07
<@McMartin>
Most of your analyzers out there these days have either "You fucked up" or "I can't tell if you fucked up" as their two states.
23:08
<@McMartin>
Bounds-checking ahead of time reduces to the halting problem.
23:08
<@McMartin>
Bounds-checking at run time, on the other hand, is a single branch instruction.
23:08
<@MyCatVerbs>
Yes, which is why you have to have false positives.
23:08
<@McMartin>
Assuming your language knows about bounds in the first place, which C doesn't.
23:09
<@MyCatVerbs>
Or you can use a BDSM-powered type checker, and write type annotations.
23:09
<@McMartin>
Insufficient for bounds checks.
23:09
<@MyCatVerbs>
Mmmmmnope. Not with a sufficiently powerful type system.
23:09
<@McMartin>
Insufficient for bounds checks and not crippling the algorithms.
23:09
<@McMartin>
It's possible to write correct code that will never check.
23:09
<@McMartin>
Like, you know, strcpy.
23:10
<@McMartin>
Or you can have an infinite type depth.
23:10
<@MyCatVerbs>
Yes, that's the point.
23:10
<@McMartin>
See, the thing is, strcpy is the kind of thing you want to write.
23:10
<@MyCatVerbs>
You need to make the type checker Turing complete, and then annotate your program with proof of its correctness.
23:10
<@McMartin>
It is unreasonable to the point of being able to say "it is impossible for general purpose computing" to insist on guaranteed ranged types everywhere.
23:11
<@McMartin>
No, that's still not good enough.
23:11
<@MyCatVerbs>
Yes it is.
23:11
<@McMartin>
You're assuming correctness is decidable.
23:11
<@MyCatVerbs>
No I'm not.
23:11
<@McMartin>
"So you just make it so that loading any program or module takes a possibly infinite amount of time"?
23:11
<@MyCatVerbs>
I'm assuming that correctness is recognisable (it is) and that a bored enough programmer can write a proof of correctness that a type checker can recognise.
23:11
<@McMartin>
That's not a "just".
23:12
<@MyCatVerbs>
I didn't say "just", you did. :P
23:12
<@MyCatVerbs>
Y'know about the Curry-Howard isomorphism, right? Type checkers /are/ theorem provers?
23:12
<@McMartin>
And type systems are dataflow frameworks.
23:13
<@McMartin>
"What is the value of this integer variable at this program point" is so very, very unsolved.
23:13
<@McMartin>
Without, you know, running to that point and looking.
23:13
<@McMartin>
At which point you might as well just run the program.
23:13
<@MyCatVerbs>
You keep assuming that you're going to get no help at all from the programmer. Why is this?
23:14
<@McMartin>
For two reasons
23:14
<@McMartin>
(a) Programmers won't do the work
23:14
<@MyCatVerbs>
Do tell, have you ever actually *worked* with a powerful static type system? You do understand that Turing-complete type checkers can be used to demonstrate arbitrary theorems about the program's behavoir, right?
23:14
<@McMartin>
(b) Part of doing the work involves structuring the code in very specific ways, given our knowledge of how to demonstrate something.
23:14
<@McMartin>
Yes. My doctoral thesis research is on interprocedural program analysis.
23:15
<@McMartin>
I went dynamic at several points precisely because the best static results in the field are flatly not good enough to get the guarantees I needed to answer the real questions.
23:16
<@McMartin>
The best examples of what you describe that I can think of are things like SLAM.
23:16
<@McMartin>
And SLAM is a very very strict subset of all computations.
23:16
<@MyCatVerbs>
"Yep, but that's a lazy fucker issue, not a mathematical tractability issue" and "no? Because you can rephrase a given proof in arbitrarily many ways, you can give an arbitrarily complicated type to any given lump of code."
23:16
<@MyCatVerbs>
Er, try "every programming language on the planet with dependant typing", because there's more than one, y'know.
23:17
<@McMartin>
Um
23:17
<@McMartin>
You can cast pointer alias information as a type lattice, yes.
23:17
<@McMartin>
However, that type lattice has 10^40 elements even in moderate-sized code.
23:17
<@McMartin>
There are tricks you can use that will maybe make operations and queries on it tractable.
23:17
<@McMartin>
And pointer aliasing is easier than bounds inference.
23:18
<@MyCatVerbs>
McMartin: take strcpy() for instance. All you need for it to be acceptable is to build a proof that it never runs off the end of null-terminated strings, that it writes into the same number of characters of the output buffer as there are characters in the input buffer.
23:18
<@McMartin>
... No.
23:19
<@MyCatVerbs>
Yes.
23:19
<@McMartin>
You also need to prove that there is actually enough space in both buffers at every call point.
23:19
<@MyCatVerbs>
McMartin: it'll come out with a type that expresses the fact that when you use it, the buffer sizes will need to be appropriate, and the source string must be null terminated.
23:19
<@McMartin>
This means that you need to know how much space is available to every argument that can be passed to it.
23:19
<@McMartin>
C does not carry size information with its pointers.
23:19
<@McMartin>
Because C doesn't actually have an array type.
23:20
<@McMartin>
I'm willing to believe that string::copy() can be proven safe.
23:20
<@MyCatVerbs>
McMartin: at the call site, you then need to add a proof that the preconditions hold.
23:20
<@McMartin>
safety isn't just about how it's written; it's about how it's used.
23:20
<@McMartin>
You realize at this point you are no longer writing C.
23:20
<@MyCatVerbs>
Nope.
23:21
<@MyCatVerbs>
The executable code itself, C. The annotations, very definately not C.
23:21
<@McMartin>
Are you claiming that without rewriting the source, it would be possible to write proofs of safety or demonstrable exploits?
23:21
<@McMartin>
I need to be clear on this.
23:21
<@MyCatVerbs>
The annotations - the type annotations here - are the alterations to the source that you are talking about.
23:22
<@McMartin>
No, no that's not at all what I'm talking about.
23:22
<@MyCatVerbs>
No different to writing a C program and then writing a proof of its correctness on a blackboard, except that you're writing the proof in formal language and then using a theorem prover to check that you haven't goofed in the middle.
23:22
<@McMartin>
You're claiming this can be done for any C program.
23:22
<@McMartin>
I disagree sharply.
23:22
<@McMartin>
I can make my bounds dependent on the result of unsolved mathematical problems.
23:22
<@MyCatVerbs>
Well that's fine, because we're not talking about what you suggested, we're talking about what I suggested.
23:23
<@McMartin>
And I'm saying that your solution is inferior to just running it in a sandbox.
23:23
<@MyCatVerbs>
And a theorem prover will reject them summarily unless you accompany them with a proof of the mathematical conjecture. What's the issue?
23:23
<@McMartin>
... that NP exists, basically.
23:24
<@MyCatVerbs>
Oh yeah, so you might have to sit around and wait a while. :)
23:24
<@McMartin>
Let's say I'm investigating the unsolved 2n+1 problem.
23:24
<@McMartin>
For every number we've tried, this property holds.
23:24
<@McMartin>
There is to date no proof that it holds in general.
23:24
<@MyCatVerbs>
And preferrably structure your proofs so that the theorem prover can get through'm quickly.
23:24
<@MyCatVerbs>
McMartin: then it's not acceptable to use.
23:24
<@McMartin>
If some bound hinges on the result of this unsolved problem, it is by hypothesis impossible to prove.
23:24
<@McMartin>
Whereas a dynamic check is one machine instruction and strictly more powerful.
23:25 * ToxicFrog catches up
23:25
<@ToxicFrog>
Sidenote: "the executable code" is not C, it's machine code
23:25
<@McMartin>
So yeah, I like the ability to run programs where I don't already know the answer because that's one of the things computers are for.
23:25
<@McMartin>
And building histograms out of them, etc.
23:26
<@MyCatVerbs>
McMartin: it's not acceptable to use it if you're writing the program and then writing the proof on the blackboard and then getting a class full of grad students to look for holes in it. Why should I expect the formal theorem prover to treat the situation any differently?
23:26
<@McMartin>
... MCV, you're saying that it's never acceptable to write a program that computes histograms of chaotic processes.
23:26
<@McMartin>
This is patently ridiculous.
23:27
<@MyCatVerbs>
No, I'm saying it's never acceptable to write a program that indexes into an array at a point determined by the output of a chaotic process, unless that output is guaranteed to be bounded.
23:28
<@McMartin>
See, I would hold that it's OK to do so if you check before doing the index.
23:28 * MyCatVerbs nodnod.
23:28
<@McMartin>
And type-based range systems have failed to handle this eventuality
23:28
<@MyCatVerbs>
Yes! In fact, a range check would be a fantastically easy way to prove the condition.
23:28
<@McMartin>
Well, sort of.
23:28
<@McMartin>
Not in C, because C doesn't know what bounds are.
23:28
<@McMartin>
In any sane language, yes.
23:29
<@McMartin>
So, you know, don't bother with the static test at all, just make range checking something the compiler does.
23:29
<@MyCatVerbs>
Right, but we can express the bounds at type-level.
23:29
<@McMartin>
It is excruciatingly ugly if you do that for real.
23:29
<@McMartin>
Standard Pascal did.
23:29
<@McMartin>
bounds should be object level.
23:29
<@McMartin>
It turns out to often be useful to have objects like arrays be resizable.
23:30
<@MyCatVerbs>
*thinks* Yes, but standard pascal's types were all monomorphic.
23:30
<@McMartin>
Even if they weren't, it's not really worth the effort. You've looked at ESC and its friends, right?
23:30
<@MyCatVerbs>
Whereas a proof that, given an array of size n, you won't index any further than n elements into it, that's rather more tractable. You can do that quite happily with dependent types.
23:31
<@McMartin>
Type annotations are totally awesome for mixed-binary things, much as dataflow itself.
23:31
<@MyCatVerbs>
McMartin: no, link please?
23:31
<@McMartin>
(Linux adapted it for kernel vs. user pointers)
23:31
<@McMartin>
http://kind.ucd.ie/products/opensource/ESCJava2/
23:32
<@McMartin>
I'm not familiar with v2
23:32
<@McMartin>
v1 was state of the art when I was starting out and was a general pre/post condition engine.
23:32
<@MyCatVerbs>
McMartin: 'sides, you've exaggerated the problem a little bit. I don't necessarily have to prove that my program does anything interesting, just that it won't call dlsym() or exec() in a stupid manner, or smash its stack or heap.
23:33
<@McMartin>
Heap smashing is hard to prove in real programs too.
23:33
<@McMartin>
There is a fair amount of C that "deliberately smashes the heap" in ways that are guaranteed to always be safe
23:33
<@MyCatVerbs>
Example?
23:33
<@McMartin>
AFAIK the state of the art there is just to shadow memory as a whole.
23:34
<@McMartin>
Example: When coding for the 286, linked lists and similar structures would, instead of storing a pointer to the next element, store instead the pointer difference between themself and the next element.
23:34
<@McMartin>
This is because on non-general-purpose architectures like the 286, doing so let you do checks on that next element's values without completely trashing your register array.
23:34
<@MyCatVerbs>
Ewwwww.
23:34
<@MyCatVerbs>
Clever, though.
23:35
<@McMartin>
It's one of the reasons ptr_diff_t exists as its own type in ANSI C.
23:35
<@McMartin>
So you can do all kinds of wacky semantic tricks to try to show that any such ptr_diff_t produces a valid value under condition X-Y-Z...
23:35
<@McMartin>
... or, more realistically, you tell Valgrind to keep a set of ranges of "these values are OK to dereference"
23:36
<@McMartin>
Custom allocators and the like will also occasionally build up pointer values mathematically from first principles
23:36
<@McMartin>
Which is even worse, but it's a custom allocator so you have to build your logic system around that.
23:36
<@MyCatVerbs>
Doesn't Valgrind keep up anyway? It just knows which locations you've deliberately mmap()'d or malloc()'d and works its way up from there, right?
23:36
<@McMartin>
Yeah, that's what I mean by "shadowing memory"
23:37
<@McMartin>
And it generally knows explicitly about malloc(), hooking the calls to mark the memory appropriately.
23:37
<@MyCatVerbs>
Ah, I thought you meant a static analysis technique there.
23:37
<@McMartin>
I'm offering these up as "things that it would be unreasonable to attempt statically"
23:38
<@McMartin>
Although, of course, malloc() is one of those routines that interacts with the outside world, which is usually your marker for static going bye-bye.
23:38
<@McMartin>
[* I'm counting systematic simulation, even if vaguely abstracted, as "dynamic" here.]
23:39
<@McMartin>
[** Yes, this is a point of major contention. I have a talk on this very topic I'm giving at USENIX next week.]
23:40
<@McMartin>
(In a completely different domain)
23:40
<@MyCatVerbs>
k. I'm quibbling with "unreasonable", because I still think you'd be quite happy to prove that these isolated sections of your program doesn't smash the heap on a blackboard, and from there it's only a matter of formalizing the proof.
23:40
<@McMartin>
One of the big problems that hits analyses of this sort is the fact that you can take the address of anything.
23:41
<@MyCatVerbs>
I'm *not* going to quibble with "uneconomic", because when have you *ever* seen correctness demonstrations for programs that actually got deployed into production? Outside of firmware for medical implants, maybe.
23:41
<@McMartin>
Windows 2000's device drivers, actually.
23:41
<@McMartin>
That was that SLAM thing I mentioned before
23:41
<@MyCatVerbs>
Wow, really?
23:41
<@McMartin>
Yeah.
23:41
<@MyCatVerbs>
Why bother?
23:41
<@McMartin>
Have you noticed how everyone who likes stability still thinks 2k is the best OS MS ever made~
23:41
<@MyCatVerbs>
Mmmm.
23:42
<@McMartin>
More seriously, because the Driver Development Kit was already a restricted subset, so they required it be restricted more
23:42
<@MyCatVerbs>
Yeah but *glances at TF* half the damn time the fucking hardware is buggy too.
23:42
<@McMartin>
IIRC, among the restrictions were that while loops were forbidden
23:42
<@McMartin>
(That is, for any loop, you had to be able to guarantee trivially that it would execute at most N times)
23:43
<@McMartin>
(Once you have that the code is effectively loop free and you get a lot more freedom about what you can prove)
23:43
<@McMartin>
Also, drivers as a subset have much more restricted memory access
23:43
<@MyCatVerbs>
Good restriction, I guess.
23:43
<@McMartin>
You don't have to see a line like int f(int *a) { *a = 5; } and conclude "well, shit, now I have to find every call to f to make sure that its first parameter is not one of those ptr_diff_t things"
23:44
<@MyCatVerbs>
Not like there's anything you can't work around there. If you really need to do more work, just give the CPU back and then have another go next time the timer interrupt comes up.
23:44
<@McMartin>
I must assume that's what they did
23:44
<@MyCatVerbs>
In fact, "half the time" is pretty optimistic. Your proofs would get -really- convoluted if you had to submit proofs that your programs were correct in the face of common BIOS bugs.
23:44
<@McMartin>
I mean, also part of it was them saying "AHAHAHAHA WE ARE M$ YOU SHALL DO AS WE SAY IF YOU WANT YOUR HARDWARE TO RU ON ANYTHING"
23:45 * MyCatVerbs thinks.
23:45
<@MyCatVerbs>
Hee.
23:45
<@MyCatVerbs>
In theory, that isn't actually a competitive advantage, though.
23:45
<@McMartin>
It's not
23:45
<@McMartin>
You'll also note that people who *don't* like 2K tend to complain about compatibility~
23:45
<@MyCatVerbs>
Nono, I mean the fact that they can (to a certain extent) boss hardware vendors around.
23:46
<@McMartin>
However, re: advantage, that's what it *took*.
23:46
<@McMartin>
Not counting the other major example, of course
23:46
<@McMartin>
That being the JVM
23:46
<@McMartin>
.class files strictly speaking count as proof-carrying code.
23:46
<@MyCatVerbs>
After all, once they've said to the vendors that the shipped hardware must be sane so that stable drivers can be written for it, then everyone else can write stable drivers too because now that same hardware will be showing up as a commodity.
23:46
<@McMartin>
Aha.
23:47
<@MyCatVerbs>
(And it doesn't spontaneously get less sane once it's away from MS-land.)
23:47
<@McMartin>
Yeah, so, it turns out MS clearly wasn't as big as they thought, given how XP turned out~
23:47
<@MyCatVerbs>
(Excepting Winmodems.)
23:47
<@MyCatVerbs>
How do you mean?
23:48
<@McMartin>
I mean, they tried to blackmail the hardware guys, but "compatibility" was clearly a much bigger goal for XP than 2K's "stability"
23:48
<@McMartin>
Which reads to me like their blackmail backfired
23:49
<@McMartin>
That's just tea-leaf reading though
23:49
<@MyCatVerbs>
Pity. If they'd hardlined, I still think they'd've won.
23:49
<@McMartin>
They're getting owned by it again in IE8 =(
23:50
<@McMartin>
The SLAM guys and one other analysis group at MSR have actually given talks on getting their tech adopted.
23:50
<@McMartin>
The IE8 thing is separate, though
23:50
<@McMartin>
IE8 by default is trying to actually be, like, standards compliant.
23:50
<@MyCatVerbs>
Mmmm.
23:51
<@McMartin>
You may have heard that it's impossible to browse anything on Vista.
23:51
<@MyCatVerbs>
And it turns out that lots of webpages are written against IE7's version of "standards compliant". :)
23:51
<@McMartin>
Better. This is because 80% of the web sites out there say "Oh, look, this is IE! I shall deliberately break everything!"
23:51
<@MyCatVerbs>
IMO, the ideal solution is the bodge they proposed - add a "we_really_mean_it_give_us_standards" tag.
23:51
<@McMartin>
Because they didn't check IE version.
23:51
<@McMartin>
That's better than mine.
23:51
<@MyCatVerbs>
Oh yeah, that. Heh.
23:52 * MyCatVerbs thinks.
23:52
<@McMartin>
Which was that they rename IE8 to something like Teh Vistz0r Webotron
23:52
<@MyCatVerbs>
No it isn't.
23:52
<@McMartin>
And then the sites would go "we have NFI what this is so we'll give it the standards-compliant stuff and hope for the best"
23:52
<@McMartin>
Well, not literally Teh Vistz0r Webotron
23:52
<@MyCatVerbs>
Calling it "Teh Vistz0r Webotron" and disabling all the old-IE-specific shite in javascript would be precisely the ideal solution.
23:53
<@McMartin>
Oh.
23:53
<@McMartin>
AIUI IE8's default mode indeed disabled that
23:53
<@McMartin>
It had a "turn on old extensions switch" and they also added a "replicate all of IE7's bugs" switch too.
23:54
<@McMartin>
The problem there is that from a consumer standpoint, if FF and IE7 seem to look right and IE8 is horribly wrong, clearly the problem is with IE8, right?
23:55
<@MyCatVerbs>
All the old switching on browser vendor? getBrowserName() returns "Vistz0r" instead of "IE", so now all the sides that do case getBrowserName() of { IE -> doIE6SpecificHacks(); other -> assumeStandardsCompliance() } now more or less work, kinda. :)
23:55
<@McMartin>
Yeah, that's what I meant.
23:55
<@MyCatVerbs>
I insist that that's the ideal solution. :)
23:56
<@MyCatVerbs>
Hell, have it masquerade as Firefox if necessary, just get people to feed you standard HTML so that you can actually render it. :)
23:56
<@McMartin>
Well, there are almost certainly legal reasons to forbid that.
23:56
<@MyCatVerbs>
Hee. Nope.
23:57
<@MyCatVerbs>
Swipe some random piece of code from Mozilla's repos, embed it in a random place. It's now a derivative of Mozilla. Even though you're not allowed to call it "Mozilla", you are allowed to call it "Mozilla-derived".
23:58
<@MyCatVerbs>
Like how lots of distros aren't allowed to ship "Firefox", they have to ship "Firefox Community Edition", because they haven't gotten the Mozilla foundation's permission to use the trademark. (At least I think it's the foundation that they need to ask.)
--- Log closed Thu Jul 24 00:00:47 2008
code logs -> 2008 -> Wed, 23 Jul 2008< code.20080722.log - code.20080724.log >