January 19, 2012
Stuff I learned while writing a CTF
This blog entry talks about some of the lessons I learned running the WebHacking class for Infiltrate 2012, which included a WarGame/CTF style hootenanny on the final day.
To be clear, I didn’t write the entire thing myself; I had a ton of help. Many Immunity folks contributed to this class in their spare time while also doing consulting or other work. So high fives to the following hombres in alpha order: Admin Team (Carissa and Vanessa), Chris, Dami, Dave, Justin, Leonardo, Matias, Mark, Miguel and Nico.
Word Count: ~2000
How the class worked:
WebHacking was a three day course: days one and two were lecture and exercises, while day three was practical. During days one and two we tried to fit in as many exercises as we could, because we wanted students to keep their fingers on the keyboards as much as possible. Our goal was roughly one exercise for every 5 slides of content. On completing most exercises, students would receive a token which they could redeem for points. On day three we had challenges/puzzles that covered the content from days one and two; completing one of those also earned the student a token.
Dami, our Django sensei, implemented a web application scoring server. The scoring server handled cryptographically “secure” token generation via a key that only lived on the scoring server. Students submitted their tokens to the server, which then tracked scores and provided fancy graphing. The server itself had a good bit of functionality, including the ability for students to submit links for us to click on. As you can probably already guess, that setup produced some unintended lulz.
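For the curious, here’s a minimal sketch of one way keyed token generation could work. This is not Dami’s actual code: the use of HMAC-SHA256, the idea of binding a token to a student and an exercise, and every name below (`SERVER_KEY`, `make_token`, `verify_token`) are assumptions for illustration only.

```python
import hashlib
import hmac

# Hypothetical key. In a real deployment this would be a long random secret
# that never leaves the scoring server.
SERVER_KEY = b"replace-with-a-long-random-secret"


def make_token(student_id: str, exercise_id: str, key: bytes = SERVER_KEY) -> str:
    """Return a hex token that binds one student to one exercise.

    Assumes neither id contains ':'; otherwise two different pairs
    could produce the same message.
    """
    message = f"{student_id}:{exercise_id}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_token(student_id: str, exercise_id: str, token: str,
                 key: bytes = SERVER_KEY) -> bool:
    """Recompute the token and compare in constant time."""
    expected = make_token(student_id, exercise_id, key)
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

Without the key a student can’t mint a valid token, and a token issued for one student won’t verify for another, which would make token sharing easy to spot.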
We kept the attitude that if the students were able to break something in a way we had not intended then that was awesome.
CTF/WarGame both imply some type of defense when in this situation there was none. However, security people seem to be really keen on these terms and they’re universally understood as roughly describing what we did on day three. I’ll use challenge and puzzle interchangeably to describe day three activities as I think they are more fitting terms.
1) I had all the students agree that they wouldn’t tamper with the challenges in a meaningful way. If they wanted to tag their name on a page that was fine, but all challenges had to remain solvable. Breaking an exercise meant a deduction in points.
2) No network level attacks, no physical attacks on devices or people, no attacking the scoring server or the resource server (hosting files for download), no attacking VM hosts, no DoS.
3) Outside tools were allowed within reason; I think our general rule was they had to be open source.
This would work better as two separate courses: an introduction and an intermediate level. The lion’s share of the current content probably fits the introductory level, which I’m ok with, but we want students to be matched with the class that best suits their skillset. As it stood, going from an introduction to XSS to padding oracle attacks was kind of jarring.
You’ll note there’s no mention of Ajax, JSON or SOAP or framework specific issues (e.g. JSP, .NET). I’m not sure where that would all fit in while keeping the class a manageable length. That content is super important but I don’t want to get stuck out in the weeds. This is the general direction I’d like the class to evolve:
The introductory class should be 4 days:
[Optional] Day 1 – Python Introduction, Linux fundamentals, Web fundamentals
[Required] Day 2 – OSIG, versioning, light auditing, XSS/XSRF, RFI/LFI, Intro to SQLi
[Required] Day 3 – Scripting up repetitive tasks, SQLi, light privilege escalation
[Required] Day 4 – Review, Puzzles/Challenges
The intermediate class should also be 4 days:
[Required] Day 2 – SQLi (Optimizing blind SQLi, NLTK, Unicode)
[Required] Day 3 – Padding Oracle, in depth privilege escalation, anti-forensics
[Required] Day 4 – Review, Puzzles/Challenges
Hackers can be very competitive:
I’m not particularly competitive by nature, so in retrospect I severely underestimated how much students’ scores would matter to them. Students took their scores very seriously, and multiple folks approached me during the class to be sure they had completed every possible point-scoring exercise. I wish we had paid more attention to the number of points we assigned to each puzzle and made the point information available to the students. For a while it looked like we were going to have a tie, which we hadn’t planned for, so a clearly defined rule set for how different scoring scenarios would be handled would have been good.
We assigned points to challenges roughly based on how long we thought it would take us to beat the challenge if we’d never seen it before. Challenges that would take us an hour were hard, challenges that would take us 15 minutes were easy, and each was scored as such. I think we should have thought harder about how things were scored, including the discretionary tokens we gave out when a student was especially clever. If anything, scores were a little low: hard exercises should have been ~20 points and discretionary tokens should have been capped at 5 points. We also needed enough exercises that a student who hadn’t gotten discretionary tokens could come back from that disadvantage.
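If I were to write that scoring rule down, it might look something like the sketch below. The linear scaling is my own assumption (we never had a formula), and the function and constant names are made up; the only numbers taken from above are the one hour “hard” estimate, the ~20 point target for hard exercises, and the 5 point cap on discretionary tokens.

```python
HARD_MINUTES = 60        # our estimate for a "hard" challenge
HARD_POINTS = 20         # what a hard challenge should have been worth
DISCRETIONARY_CAP = 5    # upper bound for a single discretionary token


def challenge_points(estimated_minutes: float) -> int:
    """Scale points linearly with estimated solve time, with a floor of 1."""
    return max(1, round(estimated_minutes * HARD_POINTS / HARD_MINUTES))


def discretionary_points(requested: int) -> int:
    """Clamp an instructor-awarded bonus to the range 0..DISCRETIONARY_CAP."""
    return max(0, min(requested, DISCRETIONARY_CAP))
```

Under these assumptions `challenge_points(60)` gives 20 and `challenge_points(15)` gives 5, so one discretionary token is never worth more than one easy challenge.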
We made use of an open source Python implementation of a diff utility provided by Google. When we intended the students to RTFM (e.g. to explore a new API), we should have made that intention much clearer. We had more than a few quizzical looks and questions about why the relevant API documentation wasn’t laid out in the slides. RTFM is one of those things you have to get used to in this industry, but people don’t expect it in a class. Our fault on that one.
PHP is a great language to write this stuff in:
PHP is very useful because the level of knowledge you need to get a page up and running quickly is low. It’s also useful in our scenario because it’s awesomely easy to write vulnerable apps. Which is to say it’s easy to fuck up PHP. Meaning, I get excited when I see a PHP app on a gig. Therefore, please write more PHP.
Number of exercises/challenges is key:
If you’re running challenges/puzzles for people who have had hands-on penetration testing experience, you need a lot of content. Take how many challenges you think you’ll need, now double it and add 10 to that number. All of our students were smart, most had some offense experience, a few were studs. The gentleman who ended up winning had about 30 minutes of time left after completing all our challenges minus one. Ideally he would have had several more exercises to choose from to keep him occupied. Counting the exercises from lecture, I think we had around 50 total exercises/challenges/puzzles for students to complete.
In our Introduction to Python section we had the students create a basic brute force script as the culminating exercise. As a bonus exercise those students who were comfortable enough with Python to skip our review had the option of solving additional problems to incorporate more features into their script.
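To give a feel for the size of that culminating exercise, here’s a rough sketch of a basic brute force loop. It isn’t the script from the class: `brute_force` and `try_login` are hypothetical names, and `try_login` stands in for whatever request a student would send to the lab target.

```python
from typing import Callable, Iterable, Optional


def brute_force(candidates: Iterable[str],
                try_login: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that try_login accepts, or None.

    candidates can be any iterable of strings, for example an open
    wordlist file, so trailing newlines and blank lines are tolerated.
    """
    for candidate in candidates:
        candidate = candidate.strip()
        if not candidate:
            continue  # skip blank lines in the wordlist
        if try_login(candidate):
            return candidate
    return None
```

Taking the check as a callable keeps the loop testable without a network. The bonus features (threading, resuming, rate limiting and so on) would then be layered on top of it.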
Exercises/challenges should be functional apps whenever possible:
I created a small web application for the course that functioned as a simple RFC lookup app. You could give it an RFC number and it would spit out the corresponding RFC text. It’s simple enough to illustrate the bug we wanted without forcing the students to bug hunt too long, but it had enough functionality to give the impression that you might find it on a gig. Your application has to do something more than just be vulnerable. Attention to little details like using free templates, customizing CSS, etc. gives the class a more polished and satisfying feeling.
Holy VMs Batman:
We made the decision early on that the VMs would be instructor controlled and the students would receive plain Linux laptops. I think at final count we had over 20 VMs that powered the entire class. On our next class iteration we’ll undoubtedly have 10-15 more VMs. We had five instructor laptops running the class at the start and we ended up with one more when Matias rotated in. Things got a bit crowded on our table and we were always tripping over each other to get to the VM we needed. A minirack may be the way to go next year; the White Wolf Security guys have used this setup with good success. Talking with our IT guys, it seems like we can put together a portable 14U rack with networking, power and three servers for under $10k.
Make a standard VM:
Most of our VMs were an Ubuntu 11.10 server based LAMP stack. All the exercises were written to run on that platform (with few exceptions), so if one VM died we could take 10 minutes and port an exercise over to another laptop or VM. Don’t forget to change the VM’s MAC in VMware if you’re just doing a straight copy. Daily snapshots are another good step towards winning.
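If you’d rather assign MACs by hand than let VMware regenerate them, a small helper like the one below can spit out fresh addresses. It’s only a sketch: the function name is made up, and it relies on my understanding that VMware reserves 00:50:56:00:00:00 through 00:50:56:3F:FF:FF for manually assigned addresses, so check that against the docs for your product version.

```python
import random


def random_vmware_mac(rng=random) -> str:
    """Return a random MAC in the range 00:50:56:00:00:00 - 00:50:56:3F:FF:FF.

    rng can be the random module or a random.Random instance
    (pass a seeded instance for repeatable output).
    """
    octets = [
        0x00, 0x50, 0x56,           # OUI used for manually set VMware MACs
        rng.randint(0x00, 0x3F),    # fourth octet is limited to 0x00-0x3F
        rng.randint(0x00, 0xFF),
        rng.randint(0x00, 0xFF),
    ]
    return ":".join(f"{octet:02x}" for octet in octets)
```

As far as I know, the result goes into the copied VM’s .vmx file as `ethernet0.address`, with `ethernet0.addressType` set to `"static"`. Collisions are unlikely at classroom scale, but keep a list of what you’ve handed out anyway.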
Have detailed install instructions:
I had a bunch of folks helping out by writing exercises. One of my rules from the get-go was that I had to be able to install your exercise in under 5 minutes. If your exercise required a DB, you needed to provide me either a SQL dump or a .py to populate the DB. I needed complete set up instructions including all the package names for your dependencies and anything that had to be compiled from source. We definitely had a few things die on us during or before class, so having this information handy let us manage that crisis pretty readily. Solutions in the form of Python scripts (where appropriate) were also required so we could easily test and spot problems in the installed exercise.
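For anyone wondering what a “.py to populate the DB” might look like, here’s a minimal sketch. Our exercises ran on a LAMP stack, so the real scripts would have talked to MySQL; I’m using sqlite3 here purely so the example runs with nothing but the standard library. The table, seed rows and function name are all invented for illustration.

```python
import sqlite3

# Hypothetical schema for an exercise that needs a users table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL
);
"""

# Hypothetical seed data; a real exercise would ship whatever rows it needs.
SEED_USERS = [("admin", "changeme"), ("guest", "guest")]


def populate(conn: sqlite3.Connection) -> int:
    """Create the schema, insert seed rows, and return the row count.

    INSERT OR IGNORE plus the UNIQUE constraint makes the script safe
    to run twice, which matters when you are rebuilding a VM in a hurry.
    """
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT OR IGNORE INTO users (username, password) VALUES (?, ?)",
        SEED_USERS,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Calling `populate(sqlite3.connect(":memory:"))` is a quick way to smoke test the script before it ever touches a class VM.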
Don’t use Wireless:
It is a ridiculous pain in the ass to fix and debug. Especially if your channel space is really crowded. If you’re using Linux laptops the wireless drivers, utilities and options are confusing for mere mortals. Bring two thick rolls of gaff tape, enough CAT5 to rig Carnegie Hall, a switch, and call it done. We also learned the hard way that you need to contract the hotel’s IT to have your port live, labeled and configured to your spec two days before your class starts.
Stuck on creating exercises?
Writing apps around a particular vulnerability class can be tough; after all, there are only so many ways to write up command injection. No problemo! Head on over to exploit-db, install an appropriately licensed vulnerable application, but tweak it enough that the exploit doesn’t work out of the box. This can be as simple as adjusting the install path, mangling the version string, tweaking the app so that whatever sanity checks the exploit does will fail, or in some cases removing a dependency to break the app somewhat.
Nerds love nerdy culture references:
This is always a big hit. A little laughter in unexpected places can relieve stress for students; the key is balancing your use of that device so that it doesn’t turn into a VH1 nostalgia marathon. Across all our content I think we referenced: Muppets, Futurama, Ghost in the Shell, Sesame Street, number theory multiple times, Jurassic Park, The Matrix, lolcats (sparingly), 90s era rap, Batman, Monty Python and a few more that I’m obviously forgetting. Security can be a humorless industry (have you read a NIST document?) so having a bit of fanservice and giggles will make your class all the better.
Some things to bring with you: spare gaff tape, duct tape, multi-tool, scissors, box cutter, crossover cable, electronics screwdrivers, CAT5 crimp tool, Aspirin, Advil, Imodium A-D, band-aids, fat sharpie, 2-3 screwdrivers with multiple head attachments, $10 in quarters for caffeine, label maker, cyanide capsules in case of capture.
This post was brought to you in part by: Mazzy Star – Fade Into You