Counting Lines of Unsafe
The first metric we're going to calculate is the number of lines of `unsafe`
in a crate. For this we don't need to run any of the later steps like type
checking or `borrowck`, so we'll just use the AST generated after the compiler
finishes parsing. We can use `CompileController.after_parse` to hook in as
soon as the compiler has finished parsing (macro expansion happens in a later
phase, which has its own `after_expand` hook).
For the actual AST inspection, the `syntax` crate provides an extremely handy
`Visitor` trait which will recursively visit each node in a `Crate`. So
that means creating an unsafe code counter will require:
- Creating a type which implements the `Visitor` trait, overriding just the
  methods which inspect `Block`s and `Item`s
- In the `Visitor` impl, storing the `Span` for all `unsafe` blocks we find
- Setting the `controller.after_parse.callback` to be a closure which invokes
  our custom `Visitor`, making sure the result gets saved back in our `Calls`
  struct
- In the top level `main()` function, printing out the results of our `unsafe`
  analysis
First, we'll create the custom `Visitor`.
```rust
# #![feature(rustc_private)]
# extern crate syntax;
use syntax::ast::{Block, BlockCheckMode, Item, ItemKind, Unsafety};
use syntax::visit::{self, Visitor};
use syntax::ext::quote::rt::Span;

pub struct UnsafeVisitor {
    unsafe_lines: Vec<Span>,
}

impl<'a> Visitor<'a> for UnsafeVisitor {
    fn visit_item(&mut self, item: &'a Item) {
        // record any function, trait or impl which is declared `unsafe`
        match item.node {
            ItemKind::Fn(_, Unsafety::Unsafe, ..) |
            ItemKind::Trait(Unsafety::Unsafe, ..) |
            ItemKind::DefaultImpl(Unsafety::Unsafe, ..) |
            ItemKind::Impl(Unsafety::Unsafe, ..) => {
                self.unsafe_lines.push(item.span.clone());
            }
            _ => {},
        }

        // keep walking so we also see whatever is nested inside this item
        visit::walk_item(self, item);
    }

    fn visit_block(&mut self, block: &'a Block) {
        // record `unsafe { ... }` blocks
        if let BlockCheckMode::Unsafe(_) = block.rules {
            self.unsafe_lines.push(block.span.clone());
        }

        visit::walk_block(self, block);
    }
}
# fn main() {}
```
It took a while to skim through all the docs for `Item` and `Block` to
figure out how you can tell when something is `unsafe`, but that's more tedious
than difficult. For now we're going to assume you'll never get an `unsafe` block
inside a function which is also defined as `unsafe`. Such a block would be
recorded twice, once as part of the enclosing item and once on its own.
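If the "override a method, then call `walk_*`" pattern is new to you, here is a minimal sketch of the same idea on a toy AST. It runs on stable Rust and doesn't touch the compiler at all. The `Stmt`, `Block`, `UnsafeCounter`, `count_unsafe_blocks()` and `sample_tree()` names are made up for this illustration and are not the real `syntax` types; only the overall shape is meant to mirror `syntax::visit`.

```rust
// A toy AST. These are illustrative stand-ins, not the real `syntax::ast` types.
enum Stmt {
    Expr,
    Block(Block),
}

struct Block {
    is_unsafe: bool,
    stmts: Vec<Stmt>,
}

// Every method has a default implementation which simply keeps walking,
// so an implementor only overrides the nodes it cares about.
trait Visitor: Sized {
    fn visit_block(&mut self, block: &Block) {
        walk_block(self, block);
    }

    fn visit_stmt(&mut self, stmt: &Stmt) {
        walk_stmt(self, stmt);
    }
}

fn walk_block<V: Visitor>(visitor: &mut V, block: &Block) {
    for stmt in &block.stmts {
        visitor.visit_stmt(stmt);
    }
}

fn walk_stmt<V: Visitor>(visitor: &mut V, stmt: &Stmt) {
    if let Stmt::Block(inner) = stmt {
        visitor.visit_block(inner);
    }
}

#[derive(Default)]
struct UnsafeCounter {
    unsafe_blocks: usize,
}

impl Visitor for UnsafeCounter {
    fn visit_block(&mut self, block: &Block) {
        if block.is_unsafe {
            self.unsafe_blocks += 1;
        }
        // Without this call the traversal would stop here and nested
        // blocks would never be visited.
        walk_block(self, block);
    }
}

fn count_unsafe_blocks(root: &Block) -> usize {
    let mut counter = UnsafeCounter::default();
    counter.visit_block(root);
    counter.unsafe_blocks
}

// A safe root block containing three `unsafe` blocks at various depths.
fn sample_tree() -> Block {
    let innermost = Block { is_unsafe: true, stmts: vec![] };
    let outer_unsafe = Block {
        is_unsafe: true,
        stmts: vec![Stmt::Expr, Stmt::Block(innermost)],
    };
    let safe_wrapper = Block {
        is_unsafe: false,
        stmts: vec![Stmt::Block(Block { is_unsafe: true, stmts: vec![] })],
    };

    Block {
        is_unsafe: false,
        stmts: vec![Stmt::Expr, Stmt::Block(outer_unsafe), Stmt::Block(safe_wrapper)],
    }
}

fn main() {
    println!("{} unsafe blocks", count_unsafe_blocks(&sample_tree()));
}
```

Note that the nested `unsafe` block is counted as well as the one enclosing it. That is the same double counting we are choosing to ignore for now.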
Next, let's define a couple of data structures to store our metrics in. They're pretty boring as-is, just a couple of Plain Ol' Data structs.
```rust
#[derive(Debug, Clone, Default)]
pub struct Metrics {
    pub spans: Vec<Row>,
    pub total_lines: usize,
}

#[derive(Debug, Clone, Default)]
pub struct Row {
    pub start: Location,
    pub end: Location,
    pub num_lines: usize,
}

#[derive(Debug, Clone, Default)]
pub struct Location {
    pub filename: String,
    pub line: usize,
    pub col: usize,
}
# fn main() {}
```
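To get a feel for how these structs might be consumed, here is a small sketch which fills in a `Metrics` by hand and adds up the unsafe lines. The `unsafe_line_count()`, `row()` and `sample_metrics()` helpers, along with the file names and numbers, are made up for the example and aren't part of the analyser.

```rust
#[derive(Debug, Clone, Default)]
pub struct Metrics {
    pub spans: Vec<Row>,
    pub total_lines: usize,
}

#[derive(Debug, Clone, Default)]
pub struct Row {
    pub start: Location,
    pub end: Location,
    pub num_lines: usize,
}

#[derive(Debug, Clone, Default)]
pub struct Location {
    pub filename: String,
    pub line: usize,
    pub col: usize,
}

/// Hypothetical helper: the number of unsafe lines across every recorded span.
pub fn unsafe_line_count(metrics: &Metrics) -> usize {
    metrics.spans.iter().map(|row| row.num_lines).sum()
}

/// Hypothetical helper for building a `Row` by hand (columns are left at 0).
fn row(filename: &str, start_line: usize, end_line: usize, num_lines: usize) -> Row {
    Row {
        start: Location { filename: filename.to_string(), line: start_line, col: 0 },
        end: Location { filename: filename.to_string(), line: end_line, col: 0 },
        num_lines,
    }
}

/// Made-up numbers, purely for illustration.
fn sample_metrics() -> Metrics {
    Metrics {
        spans: vec![row("src/lib.rs", 10, 12, 3), row("src/ffi.rs", 40, 40, 1)],
        total_lines: 120,
    }
}

fn main() {
    let metrics = sample_metrics();
    println!(
        "{} of {} lines are unsafe",
        unsafe_line_count(&metrics),
        metrics.total_lines
    );
}
```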
To make things easier, I've pulled the calling of `UnsafeVisitor` and the
resolving of all `Span`s into line numbers and file locations into a helper
function called `analyse_ast`, which takes the AST and a `CodeMap` and
transforms them into a `Metrics`.
The easiest way to return information from the internals of `rustc` to our top
level is by updating the internal state of our `Calls` to hold the data to be
returned. Because the trait definition for `CompilerCalls` doesn't ensure our
`Calls` will outlive the running of `rustc_driver`, the callback can't just
hold a plain reference to it. Instead we use an `Rc<RefCell<T>>`, which shares
ownership and moves the borrow checking to runtime. To do things properly,
`Calls` will contain an `Rc<RefCell<Option<Metrics>>>`. The type definition
looks quite intimidating, but basically it signifies that we'll only have
metrics after the compiler has run, and we wrap it in an `Rc<RefCell<T>>` so it
can be mutated by multiple entities (in this case, both `rustc` and us).
```rust
# #![feature(rustc_private)]
# #![feature(box_syntax)]
# extern crate rustc;
# extern crate rustc_driver;
# extern crate getopts;
# extern crate syntax;
# use syntax::ast::Crate;
# use syntax::codemap::CodeMap;
# use rustc::session::Session;
# use rustc_driver::driver::{CompileController, CompileState};
# use rustc_driver::{CompilerCalls, Compilation};
# use std::rc::Rc;
# use std::cell::RefCell;
# #[derive(Debug, Clone, Default)]
# pub struct Metrics;
# fn analyse_ast(_ast: &Crate, _codemap: &CodeMap) -> Metrics {unimplemented!()}
#[derive(Default, Debug, Clone)]
pub struct Calls {
    pub unsafe_metrics: Rc<RefCell<Option<Metrics>>>,
}

impl<'a> CompilerCalls<'a> for Calls {
    fn build_controller(&mut self, _: &Session, _: &getopts::Matches) -> CompileController<'a> {
        let mut controller = CompileController::basic();
        // we only need the AST, so stop once parsing is done
        controller.after_parse.stop = Compilation::Stop;

        // a second handle to the same `RefCell`, moved into the callback
        let metrics = self.unsafe_metrics.clone();
        controller.after_parse.callback = box move |compile_state: &mut CompileState| {
            let ast = compile_state.krate.as_ref().unwrap();
            *metrics.borrow_mut() = Some(analyse_ast(
                ast,
                compile_state.session.codemap(),
            ));
        };

        controller
    }
}
# fn main() {}
```
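You can see that after the AST has been analysed we set `metrics` (a pointer
to the `unsafe_metrics` field inside `Calls`) to be the result of the
analysis. Setting `after_parse.stop` to `Compilation::Stop` also tells the
driver not to carry on with the rest of compilation, since the AST is all we
need.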
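The `Rc<RefCell<Option<T>>>` dance is easier to see without the compiler in the way. Below is a minimal sketch on stable Rust where a boxed closure stands in for `after_parse.callback`. The `build_callback()` and `run_and_collect()` functions and the cut-down `Metrics` are assumptions made for the illustration, and the "analysis" is just counting lines in a string.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A cut-down stand-in for the real `Metrics` struct.
#[derive(Debug, Clone, Default, PartialEq)]
struct Metrics {
    total_lines: usize,
}

#[derive(Default)]
struct Calls {
    unsafe_metrics: Rc<RefCell<Option<Metrics>>>,
}

impl Calls {
    // Stand-in for `build_controller()`: hands back a callback which owns
    // its own handle to the shared slot instead of borrowing `self`.
    fn build_callback(&self) -> Box<dyn FnMut(&str)> {
        let metrics = Rc::clone(&self.unsafe_metrics);

        Box::new(move |source: &str| {
            // Pretend analysis: count the lines in the source text.
            *metrics.borrow_mut() = Some(Metrics {
                total_lines: source.lines().count(),
            });
        })
    }
}

// Plays the role of the driver: run the callback, then read the result back.
fn run_and_collect(source: &str) -> Option<Metrics> {
    let calls = Calls::default();
    let mut callback = calls.build_callback();

    callback(source);

    let result = calls.unsafe_metrics.borrow().clone();
    result
}

fn main() {
    let calls = Calls::default();
    // Nothing has run yet, so the slot is still `None`.
    println!("before: {:?}", calls.unsafe_metrics.borrow());

    let mut callback = calls.build_callback();
    callback("fn main() {\n    println!(\"hi\");\n}\n");

    println!("after: {:?}", calls.unsafe_metrics.borrow());
}
```

The `Option` is what lets us tell "the callback never ran" apart from "the callback ran and found nothing".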
The helper function itself, `analyse_ast()`, then just creates a visitor,
makes it visit the provided AST, then turns the result into a `Metrics` for
our analyser.
```rust
# #![feature(rustc_private)]
# #![allow(dead_code)]
# extern crate syntax;
# extern crate syntax_pos;
# use syntax::codemap::CodeMap;
# use syntax::ast::Crate;
# use syntax::ext::quote::rt::Span;
# use syntax_pos::Loc;
# use syntax::visit::{self, Visitor};
fn analyse_ast(ast: &Crate, codemap: &CodeMap) -> Metrics {
    let mut visitor = UnsafeVisitor::new();

    // analyse the crate
    visit::walk_crate(&mut visitor, ast);

    // then resolve spans to line numbers and locations
    let spans = visitor
        .unsafe_lines
        .iter()
        .map(|span| {
            let start = Location::from(codemap.lookup_char_pos(span.lo));
            let end = Location::from(codemap.lookup_char_pos(span.hi));
            // the first and last lines both count, so a span which starts
            // and ends on the same line is one line long
            let num_lines = end.line - start.line + 1;

            Row {
                start: start,
                end: end,
                num_lines: num_lines,
            }
        })
        .collect();

    Metrics {
        spans: spans,
        total_lines: codemap.count_lines(),
    }
}
# fn main() {}
# #[derive(Default, Debug)]
# struct UnsafeVisitor { unsafe_lines: Vec<Span> }
# impl<'a> Visitor<'a> for UnsafeVisitor {}
# impl UnsafeVisitor { fn new() -> UnsafeVisitor { UnsafeVisitor::default() }}
# struct Metrics { spans: Vec<Row>, total_lines: usize }
# struct Row { start: Location, end: Location, num_lines: usize }
# struct Location { line: usize }
# impl From<Loc> for Location { fn from(_other: Loc) -> Self { unimplemented!() } }
```
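Resolving a span into line numbers is the only part of this with any real logic in it, so here is a rough, compiler-free sketch of what that lookup boils down to. The `LineIndex` type is a hand-rolled stand-in written for this example, not how `CodeMap` is actually implemented. It also assumes a span is a half-open byte range and that its first and last lines are both counted.

```rust
/// Hypothetical stand-in for a code map: remembers where each line starts.
struct LineIndex {
    line_starts: Vec<usize>,
}

impl LineIndex {
    fn new(source: &str) -> LineIndex {
        // Line 1 starts at offset 0, every later line starts just after a '\n'.
        let mut line_starts = vec![0];
        for (offset, byte) in source.bytes().enumerate() {
            if byte == b'\n' {
                line_starts.push(offset + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// The 1-based line number containing the byte at `offset`.
    fn line_of(&self, offset: usize) -> usize {
        match self.line_starts.binary_search(&offset) {
            // `offset` is exactly the first byte of a line
            Ok(index) => index + 1,
            // otherwise it belongs to the line starting just before it
            Err(index) => index,
        }
    }

    /// How many lines the half-open byte range `lo..hi` touches.
    fn lines_spanned(&self, lo: usize, hi: usize) -> usize {
        let last = if hi > lo { hi - 1 } else { lo };
        self.line_of(last) - self.line_of(lo) + 1
    }
}

const SOURCE: &str = "fn main() {\n    unsafe {\n        danger();\n    }\n}\n";

fn main() {
    let index = LineIndex::new(SOURCE);

    // The span of the `unsafe { ... }` block: from the `unsafe` keyword up to
    // just past its closing brace (five bytes into the "    }" line).
    let lo = SOURCE.find("unsafe").unwrap();
    let hi = SOURCE.find("    }\n").unwrap() + 5;

    println!(
        "unsafe block runs from line {} to line {} ({} lines)",
        index.line_of(lo),
        index.line_of(hi - 1),
        index.lines_spanned(lo, hi)
    );
}
```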