Relevant Code Searches, Wildcard/Case-sensitive Searches, and More with GitSense for Bitbucket Server

If you have never thought about building a scalable code search solution, that is, something not built around grep, it may not be obvious how difficult it really is.  A basic understanding of combinatorics is enough to see why.  Given current technology, a scalable code search solution can only be optimized to do one of the following, but not both:

  • Provide "search discovery", where you can search all your repos at once, but not all your code.
  • Provide "search relevance", where you can search all of your code, but not all of your repos at once.

Bitbucket Server, along with GitLab and GitHub, optimized its search solution for "search discovery", while GitSense is optimized for "search relevance".  Both approaches have their pros and cons, and in this blog post we'll quickly highlight how GitSense brings "search relevance" to Bitbucket Server, along with other search enhancements that are unique to GitSense.

If you would like to install GitSense for Bitbucket Server, you can download both the GitSense Server and the Bitbucket add-on for free.  If you would like to play around with the add-on, you can do so at http://bitbucket-server-demo.gitsense.com.  If you are learning about GitSense for the first time, we recommend reading our previous blog posts to learn more.

How GitSense Makes Code Searching Better

  1. Relevant Code Searches
  2. Predictable Matches
  3. Wildcard, Case-sensitive and CamelCase Searches
  4. Commit Traceable
  5. Code Tree

1. Relevant Code Searches

When you search for code with Bitbucket Server, and by extension GitLab and GitHub, you can only ever match what exists on the repo's default branch.  This means any matches found may or may not be relevant to what you are looking for.  This is why you might get no matches when you know the code exists, or why the results returned were not what you were expecting.

With GitSense, search relevance is never an issue, since you can search any branch, from any repo, in any combination.  However, search relevance does have its downside, as it's not designed for searching millions of repos at once.  For the best code searching experience, you'll probably want to search with both GitSense and Bitbucket Server, since they are optimized for two different use cases.

2. Predictable Matches

When it comes to queries, GitSense tries its best to honour what you are looking for.  For example, if you click on the following links, a code search for "target.present" will be executed with both GitSense and Bitbucket Server.

GitSense
http://bitbucket-server-demo.gitsense.com/plugins/servlet/gitsense/ACME/gitlab-ee#b=bitbucket-server:ACME/gitlab-ee:master&q=target.present&t=code

Bitbucket Server
http://bitbucket-server-demo.gitsense.com/plugins/servlet/search?q=project%3AACME%20repo%3Agitlab-ee%20target.present

As you can tell by the results, there is a big difference in how GitSense and Bitbucket treat the query.  GitSense knows you want to find something that best matches "target", followed by a dot, followed by "present", and it constructs a query that best represents this, which is why only 2 matches were returned.

Bitbucket translated the query to "target AND present", which is why 77 matches were returned.  If you want to match "target.present" with Bitbucket, you'll have to quote it, like so:

http://bitbucket-server-demo.gitsense.com/plugins/servlet/search?q=project%3AACME%20repo%3Agitlab-ee%20%22target.present%22

This can be inconvenient, since a common use case for code searching is to cut and paste search queries.  If you paste "target.present", you will have to manually quote it to get the desired result.
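If you paste queries into Bitbucket search often, the quoting step is easy to script.  The following is only a minimal sketch: the helper names `quote_phrase` and `build_search_url` are our own inventions for illustration, and the URL layout is simply copied from the demo links above rather than from any documented API.

```python
from urllib.parse import quote

# Search page of the demo instance, copied from the links above.
SEARCH_URL = "http://bitbucket-server-demo.gitsense.com/plugins/servlet/search"


def quote_phrase(term: str) -> str:
    """Wrap a pasted term in double quotes unless it is already quoted."""
    term = term.strip()
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return term
    return f'"{term}"'


def build_search_url(project: str, repo: str, term: str) -> str:
    """Build an exact-phrase search URL for one repo (hypothetical helper)."""
    query = f"project:{project} repo:{repo} {quote_phrase(term)}"
    # safe="" percent-encodes colons, spaces and double quotes,
    # which is how the query appears in the links above.
    return f"{SEARCH_URL}?q={quote(query, safe='')}"


print(build_search_url("ACME", "gitlab-ee", "target.present"))
```

For the ACME/gitlab-ee example, this prints the same quoted search link shown above.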

Another example of how GitSense and Bitbucket differ in their approach to code searching is that GitSense doesn't match camelCase words by default.  For example, if you click on the following links, a code search for "max" will be executed with both GitSense and Bitbucket Server.

GitSense
http://bitbucket-server-demo.gitsense.com/plugins/servlet/gitsense/BIG/go#b=bitbucket-server:BIG/go:master&q=max&t=code

Bitbucket Server
http://bitbucket-server-demo.gitsense.com/plugins/servlet/search?q=project%3ABIG%20repo%3Ago%20max

As you can tell by the results, GitSense doesn't match "max" in camelCase words.  With GitSense, we try not to make assumptions about what you want to match, which is why camelCase words containing "max" are not matched by default.  If you want to match "max" in a camelCase word, you can easily do so by enabling camelCase matching, as shown in the screenshot below.
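The difference in behaviour can also be sketched in a few lines of code.  This is a simplified model for illustration only: it is not how GitSense tokenizes or indexes code, and `camel_parts` and `matches` are hypothetical names.

```python
import re

# Splits an identifier into camelCase parts, e.g. "maxValue" -> ["max", "Value"].
# Runs of capitals such as "HTTP" in "HTTPServer" are kept together.
_CAMEL_PARTS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def camel_parts(identifier: str) -> list:
    """Return the camelCase parts of an identifier."""
    return _CAMEL_PARTS.findall(identifier)


def matches(term: str, identifier: str, camel_case: bool = False) -> bool:
    """Case-insensitive whole-word match, optionally looking inside camelCase words."""
    term = term.lower()
    if identifier.lower() == term:
        return True
    if camel_case:
        return any(part.lower() == term for part in camel_parts(identifier))
    return False


for word in ["max", "maxValue", "getMaxSize", "maximum"]:
    print(word, matches("max", word), matches("max", word, camel_case=True))
```

In this model, "max" always matches the word "max", matches "maxValue" and "getMaxSize" only when camelCase matching is on, and never matches "maximum".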

3. Wildcard, Case-sensitive, and CamelCase Searches

If you don't know exactly what to search for, or if you need to refine or broaden your search scope, GitSense provides the following handy search options.

Trailing Wildcard Matches

To perform a trailing wildcard search, add an asterisk (*) to the end of your search term.  For example:

foo* => will match foobar, foot, etc.

However, the asterisk must be preceded by at least 3 characters.

fo* => too short
f*  => too short
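The rule above can be sketched as a small check.  This only illustrates the behaviour described here and is not GitSense code; `MIN_PREFIX`, `parse_wildcard` and `wildcard_match` are hypothetical names, and the case-insensitive comparison assumes the default search mode.

```python
MIN_PREFIX = 3  # the asterisk must be preceded by at least this many characters


def parse_wildcard(term: str) -> str:
    """Return the prefix of a trailing-wildcard term, or raise ValueError."""
    if not term.endswith("*"):
        raise ValueError("not a trailing wildcard term")
    prefix = term[:-1]
    if "*" in prefix:
        raise ValueError("only a single trailing asterisk is handled here")
    if len(prefix) < MIN_PREFIX:
        raise ValueError("too short")
    return prefix


def wildcard_match(term: str, word: str) -> bool:
    """Case-insensitive prefix match for a trailing-wildcard term."""
    return word.lower().startswith(parse_wildcard(term).lower())


print(wildcard_match("foo*", "foobar"))  # True
print(wildcard_match("foo*", "foot"))    # True
print(wildcard_match("foo*", "bar"))     # False
```

Calling `parse_wildcard("fo*")` or `parse_wildcard("f*")` raises `ValueError("too short")`, mirroring the examples above.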

Case-sensitive

By default, code searches are case-insensitive.  To enable case-sensitive searches, click the checkbox beside the "Case-sensitive" option, as shown below.

CamelCase Matching

As explained in the previous section, GitSense doesn't match compounded camelCase words by default, and we do this to reduce search noise.  When you do want to search camelCase words, you can easily do so by enabling it, as shown below.

4. Commit Traceable

As the screenshots above show, both Bitbucket (left) and GitSense (right) can tell you which files were matched and on what line, but only GitSense can tell you who last touched the matching file and when.  Mapping search results not only to a file but to a commit as well is unique to GitSense, and can come in handy if you need to sift through a lot of matches.

For example, if you know the code you are looking for was updated by a certain user, you could filter the search results to only include their changes.  Or if you know the code hasn't been touched for a long time, you could use the date information to further filter the list.  And so on.

GitSense is all about improving developer productivity, and although this additional information will probably not be used very often, it is one of those things that can come in handy when it is needed.
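To give a feel for what this kind of filtering amounts to, here is a minimal sketch.  The `Match` record, its fields and the sample data are all made up for illustration; they do not reflect GitSense's actual data model or API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Match:
    """A hypothetical search match tied to the last commit that touched the file."""
    path: str
    line: int
    author: str
    committed_at: datetime


def filter_matches(
    matches: Iterable[Match],
    author: Optional[str] = None,
    before: Optional[datetime] = None,
) -> List[Match]:
    """Keep matches last touched by `author` and/or last touched before `before`."""
    result = []
    for match in matches:
        if author is not None and match.author != author:
            continue
        if before is not None and match.committed_at >= before:
            continue
        result.append(match)
    return result


# Made-up sample data.
matches = [
    Match("app/models/user.rb", 12, "alice", datetime(2016, 5, 1)),
    Match("lib/search.rb", 40, "bob", datetime(2014, 2, 10)),
    Match("lib/index.rb", 7, "alice", datetime(2013, 8, 3)),
]

by_alice = filter_matches(matches, author="alice")
untouched = filter_matches(matches, before=datetime(2015, 1, 1))
print([m.path for m in by_alice])
print([m.path for m in untouched])
```

Combining both arguments narrows the list to files a given user last touched before a given date.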

5. Code Tree

Finding that needle in a haystack just got a whole lot easier.  With the GitSense "code tree", you'll be able to quickly find that piece of code you know exists but just can't quite remember where.  With this tree, you can drill into your search results by programming language, directory and file.
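Conceptually, drilling down by language, directory and file is a nested grouping of the matched paths.  The sketch below shows one way to build such a grouping; it is not how GitSense builds its code tree, and the `LANGUAGES` extension table and `build_code_tree` helper are assumptions made for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# A tiny, hypothetical extension-to-language table for the example.
LANGUAGES = {".rb": "Ruby", ".go": "Go", ".js": "JavaScript"}


def build_code_tree(paths):
    """Group matched file paths as language -> directory -> [file names]."""
    tree = defaultdict(lambda: defaultdict(list))
    for raw in paths:
        path = PurePosixPath(raw)
        language = LANGUAGES.get(path.suffix, "Other")
        tree[language][str(path.parent)].append(path.name)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {language: dict(dirs) for language, dirs in tree.items()}


tree = build_code_tree([
    "app/models/user.rb",
    "app/models/project.rb",
    "src/sort/sort.go",
])
for language, dirs in tree.items():
    print(language)
    for directory, files in dirs.items():
        print("  " + directory)
        for name in files:
            print("    " + name)
```

Each level of the printed outline corresponds to one level you would drill into: language first, then directory, then file.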

The End

Well, this brings us to the end of another blog post and, as usual, we hope it was informative.  As you can see, with GitSense, code searching in Bitbucket Server just got a whole lot better.  And it's worth noting that we are just getting started, as we have other search improvements in the pipeline, so stay tuned to learn more.

© 2016 SDE Solutions, Inc. All rights reserved.