Filter Data Types- Find Percentages

One of the most frustrating things that can happen when importing data is having multiple data types within the same column. For example I recently had a price file sent to me that had hardcoded cost prices e.g. $4.99, mixed in with discounts off the trade e.g. 30%. The data looked something like this:

Capture

Even though this is an extremely inconvenient data format, there are ways of separating the percentage values from the dollar values (without searching for the $ or %).

The first step is to create a new column next to the “problem” column. This will act as your filter column. After you have a space for your new filter column you will need to create a formula that will distinguish %’s from $’s.

In our case the output will be “True” for percentages and “False” for everything else. In the below example our problem data is in column “A”, if your data is somewhere else on the spreadsheet this will need to be changed accordingly.

In the first cell type:

=cell(“format”,A1)

This will grab the format of the data for that particular cell.

On percentage cells this will return “Px..”, where x is the amount of visible decimal places. E.g. 80.234% will return P3.

We only need to check the first character on the left to work out if the cell is a percentage, so change the formula to this:

=left(cell(“format”,A1))

The data will eventually be filtered, so there are only two values required for the output.

=if(left(cell(“format”,A1),1),true,false)

If the cell is a percentage output true, if it isn’t output false. Drag this formula down the column next to the problem data and it should output true for % values and false for anything else.

After you have your true false column with each cell denoting the status of the cell next to it the data can be filtered.

Highlight your new formula column

Click the Data tab

Change the newly created filter on the column to true

You will now have only the % values visible if you need the filter value should be false.

The whole process can be see here:

Resolving #### in Excel

Occasionally at work colleagues will ask me what the hash symbols in their spreadsheets mean and if the data has disappeared. The data is still there it just cannot be displayed properly because it exceeds the width of the column.

The solution, resize the columns and make them wide enough and tall enough for the data to be displayed. Fortunately excel has an Autofit button that is just a few clicks away.

In the top left-hand corner of the spreadsheet there is an arrow pointing down and to the right. Clicking this arrow will highlight all of the cells on the spreadsheet. Alternatively you can hold down Ctrl and press A.

Excel Autofit Columns

Double click the edge of any of one the columns containing data within the column row, the data will autofit width ways.

Autofit Width

Re-select all of the cells with the same arrow button or with Ctrl and A. Double click one of the row edges.

Autofit Height Row

The keyboard-shortcut for resizing cells automatically is as follows:

-Ctrl + A

-Hold Ctrl and type O, C then A

 

Using Solver Add-In to work out the Sub-values of a Total

The solver add-in is a handy plugin that is used to fill in the blanks for formulas that would usually require trial and error to workout.  In this example I will be using it to evaluate which values within a column make up a user defined total.

Adding the Solver

  • Click on File then Options
    Solver options
  • Click on the Add-Ins row on the left-hand side
    Solver Add-In
  • Click Solver Add-in, then hit Go…
  • Tick the check-box for Solver Add-In when the list of Add-Ins available appears and click OK
    Solver
  • Solver will now be visible under the Data tab in the Analysis part of the tool-strip

Using the Solver

  • Leave a blank column next to the sub-values column
  • Create a SUMPRODUCT formula that will use your sub-value column and your blank column. E.g. =SUMPRODUCT(A2:A11, B2:B11)
    This tells excel to multiply (A1*B1) + (A2 * B2) + (A3 * B3) +…….(A11 * B11)
    solver sum product addin
    The formula will return a “0” at the moment because the second column must be blank.
  • Click on the data tab then click Solver
  • Set your Object to the address of your SUMIF cell

sumproduct solver addin

  • Make sure the second radio button is set to “Value Of:” and set the text-box to the total value figure.
  • This operation only requires one constraint, click “Add” and select the blank values column, in this example it is column B. Change the Operator sign to “bin” and click OK. Now only 1’s or 0’s are allowed in the second column to complete the SUMPRODUCT formula.
  • Click Solve, the add-in will then sift through all the potential matches according to your constraints and criteria to work out which values in Column B require a 1.
  • When the solver is finished you will get the below screen, click OK.
    Solver Finished
  • The results should have a 1 next to the corresponding Values make up the total 27.29.
    Solver Finished

Extending VLOOKUP with CHOOSE

One of the more repetitive tasks an Analyst gets stuck with on a daily basis is looking up values using more than one criterion. This can be fixed by creating a temporary “Combination” column or by using several flavours of formula.

For example if we wanted to look up the Cost of Sales against a sales person, except the only way to properly identify the sales person is by looking up the Cost against the firstname and surname columns.

The simplest remedy to this problem is to create an extra column that is a combination of both the first and last names of the person, thereby creating a new lookup column. This can be achieve by:

  • Inserting a new column at the front of the range data that you are searching; this is your new identifier column.Conditional Vlookup
  • The identifier column should consist of more than one column and should uniquely identify the data you are after. In this case it is a combination of the first and last name of the salesperson.Unique Identifier
  • The vlookup statement is now modified to use the new identifier columnConditional Vlookup

Alternatively you could use vlookup with choose. Choose returns an nth value of an array and has the following syntax:

=CHOOSE(index_num, value1, [value2],…)

Ordinarily choose will return one value from a list of values. For example:

=CHOOSE(4,”TESTING”,”THIS”,”FUNCTION”,”FOR”,”YOU) will return FOR, however if you provide multiple index numbers the function will return multiple values. This will change the syntax of the statement slightly. It will also return a range that can be subbed into your VLOOKUP formula.

=CHOOSE({1,2}, x,x)

Two indexes mean that the function will now create a two column table. Similarly if you wanted to create a three column table your syntax would look like this:

=CHOOSE({1,2,3},… ,… ,…)

To create our temporary search column in memory, part of the choose statement will need to be an array, because it is the identifying part of our vlookup statement, the array part will need to be the first value of the choose statement.

=CHOOSE({1,2}, B11:B16 & C11:C16, …)

The second column will be the value which will be returned after the correct row has been identified, which in this case is the cost which ranges from D11 to D16. So the formula for Choose will now look like this:

=CHOOSE({1,2},B11:B16 & C11:C16,D11:D16)

This will retrieve the below screenshot as a range (which will actually only output an “#N/A” in a cell).

After the range is subbed into your vlookup formula

=VLOOKUP(A4&B4,CHOOSE({1,2},B11:B16 & C11:C16,D11:D16),2,0)

Choose Vlookup Conditional

Hit Ctrl+Shift and Enter. This will place {} braces around the formula and indicate to excel that this is an array formula.

{=VLOOKUP(A4&B4,CHOOSE({1,2},B11:B16 & C11:C16, D11:D16),2,0)}

The only reason the array formula (B11:B16 & C11:C16) is required in this circumstance is because a resultant column needed to be created from Columns B and C.

Conditional Formatting Formula Rules

Conditional Formatting with Formula Rules

There are several pre-defined options for highlighting ranges, Greater than, Less Than, Between… however if we want to define more than one criteria for highlighting cells, then you may want to define a Formatting Rule.

For example you may have a spreadsheet of students which you would like to highlight who have not submitted their assignments yet (X in Column E) and also whether the due date has passed(Due date in column E is less than < today); this is a task that will require the comparison of two columns and can be accomplished using a Formatting Rule using a Formula.

Conditional Formatting Example

To create a new formatting rule:

  • Highlight the range of cells you would like to apply your conditional formatting to.
  • Click on the home tab
  • Under style, click conditional formatting then click on New Rule.

Conditional Formatting Rule

  • A list of options will appear click “Use a formula to determine which cells to format”
    Conditional Rule
  • A valid formula for the scenario described above would be
    =AND($D2<NOW(),$E2<>”X”)

About the formula

There are three parts to this statement:$D2 < NOW(), $E2<>”X”, AND()

  1. $D2 < NOW()
    The first part of the statement $D2 < NOW() checks if the date in column D is before today, NOW() returns today’s date, so the first part reads if the date in column D is before today, then return True. 

    The $ in front of the columns means that for whichever cell you are running the formula for, the comparison must be made in column D. There is no $ in front of the 2 because the formula will need to iterate over the rows and the row index will need to change.

  2. $E2<>”X”
    The second part of the formula checks whether the value in column E is not equal to “X”, <> is used in excel as a not equal to operator.
  3. AND() compares two or more criteria, all of the criteria must be TRUE for the AND function to return TRUE. For exampleThe standard syntax for an AND() statement is the below:
    =AND(Condition1, Condition2,….)

    After our two criteria are wrapped in this logical function we get the original formula:
    =AND($D2<NOW(), $E2<>”X”)

    If the criteria had of required only one condition to be true we could have used the OR() conditional.

    This uses the same syntax as AND:
    =OR(Condition1, Condition2,….)

    For example if we wanted to highlight the students that had an assignment due date after the current day OR have already submitted their project, the following then OR() could have been used:
    =OR($D2>NOW(), $E2=”X” )

  • After the formula is set click Format and add fill effects or change the formatting to make the formula data stand out.
    Format Preview
    Formatting
  • Click apply and your  formula should take effect.
    formula working highlighting

 

Excel Using COUNTIF and COUNTIFS

COUNTIF

If you are counting cells and only require one set of criteria COUNTIF should be your first option.

The syntax is very straight forward:

=COUNTIF(range, criteria)

Where your range is the group of cells you are comparing and the criteria equates to True or False. If your criteria is true for the cell within your range it is added to the count.

In the below spreadsheet I can work out how many employees are absent using this formula:

=COUNTIF(B2:B11,”TRUE”)

COUNTIF FORMULA EXCEL

or count how many employees reached their target of $300 with this formula:

=COUNTIF(C2:C11,”>300″)

COUNTIF EXCEL FORMULA

COUNTIF also works with a variety of operators and wildcards:

-Works out which employees were absent using not equals

=COUNTIF(B2:B11,”<>FALSE”)

-Counts the employees that have 9 within their SalesID

=COUNTIF(A2:A11,”*9*)             

-Counts the employees that have an ID ending in 5

=COUNTIF(A2:A11,”*5)                               

-Counts the employees that have an ID with 2 in at the 3rd character and 5 at the 5th character.

=COUNTIF(A2:A11,”??2?5?”)

 

COUNTIFS

If you are counting cells that are based on more than one column and require more than one criteria the simplest formula for the job will probably be COUNTIFS.

The syntax is slightly different to COUNTIF, the only difference is it allows for multiple ranges and multiple criteria:

=COUNTIFS( criteria_range1, criteria1, criteria_range2, criteria2, .. criteria_range_n, criteria_n )

Using COUNTIFS I can now work out how many employees were not absent and reached their given target of 300 using:

=COUNTIFS(B2:B11,”FALSE”,C2:C11,”>300″)

COUNTIFS FORMULA EXCEL

COUNTIFS is subject to the same rules as COUNTIF with criteria and will allow wildcards and operators.

I can further filter the previous example using COUNTIFS. The below formula will count the number of employees that have:

  • An ID with 2 in at the 3rd character and 5 at the 5th character
  • Only employees not absent
  • Who had sales over $300

=COUNTIFS(A2:A11,”???2?5?”, B2:B11,”FALSE”,C2:C11,”>300″)

 

Excel Searching for values with VLOOKUP

Arguably the most helpful function Excel has to offer. Vlookup allows you to find any value you require within a table, providing your data is organised in vertical lists (the column headers are in the first row of the table).

The Vlookup tool allows you to search for a lookup field in one column and find its corresponding value on the same row, a column index then identifies where in that row the value will be “looked up” from.

The standard syntax looks something like this:

=VLOOKUP(Lookup Value , Table, Search Column, Exact Match or Approximate Match)

Worksheet1

Vlookup Search Values

Worksheet2

Search rows columns vlookup

 

For example if we wanted to lookup the Cost value from Worksheet2 and place it into Worksheet1, you would use the vlookup formula.

To get this formula started we need:

  • A unique identifier for each row. The hardest part of the formula is working out what uniquely identifies the data row you are searching for. In the this case it is the SalesID, which identifies the cost of a sale in both worksheet1 and worksheet2.
  • A range of cells/table to search within. You can use a range such as $A$2:$D$12 (the $ makes the row or column static, so if you drag the formula your range will not change).
    However if your table is dynamic and your list size is going to change, you are better off using columns to define your table. Providing you do not have separate tables below the current table. So the range in this example could be defined as $C:$E or $C$2:$E$12.
  • A column index indicating which column within our table we would like to retrieve our value from, which must be within the limit of our range. C-E is our column range which only provides us with 3 columns, Cost is the third column so this variable will be set to 3.
  • You can optionally include a true or false as your forth variable, this indicates whether you are searching for a partial match or an exact match. If you decide to use * wild cards in your  search criteria, you should use partial match, otherwise using  false for an exact match is advisable.

The result:
=VLOOKUP(C2,$C:$E,3,FALSE)

 

Excel Conditional Filtering

Excel provides a wealth of filters that are readily available under the current filter option, however sometimes in our need to cut and modify data we require a more precise method of filtering through the rows.

Number Filters Equals Formulas

If in the below screenshot I wanted to grab all the accounts that equal “546” and display only their Sales data, I could place a filter on the AccountID. However I am still left scrolling through a list of values to find “546” which is not ideal if the list has thousands of unique values.

Filtering Rows Columns Excel Formulas

 

To avoid this we can create a new column which displays “t” if the value in column A is equal to “546” and “f” if it doesn’t.

The basic conditional format is =IF(Condition, [value if true], [value if false])

  =IF(A2=”546″,”t”,”f”)

This formula simply states, if cell A2 is equal to “546” display “t” otherwise display “f“.

The condition must equate to a true or false value, if your condition returns a string of characters or anything other than a true/false value then your formula will return an error and an #NA will be displayed in the cell.

Double click on the right-hand corner of the first cell to populate the column with the formula.

Take off the Filter on the first row (if it is currently on) and reapply it (under the Data tab) so that there is a drop down box on the new column. You can now filter the values using the equality formula set against the rows.

Select “t using the newly created drop down box and your correctly filtered data will remain.

Filtering Formulas Excel Rows Tables Columns

By default your values will be of type general, however if you have pasted your data from an unknown source you may have different data types in your rows and columns.  When checking for equality between your criteria and the column data the data types should be the same. Otherwise your conditional may not evaluate to true.

You can change the data type of your cells by highlighting them, right-clicking and selecting “Format Cells”, if both your columns are set to the same type here, then your formula should work.

The formula can be easily modified to filter using more complicated requests.

If you wanted accounts which begin with 546, the formula would look like this:

=IF(LEFT(A2,3)=”546″,”t”,”f”)

If you were after only customers that contained the word “Anonymous” the formula could be modified to look like this:

=IF(ISNUMBER(SEARCH(“Anonymous”,B2)),”t”,”f”)

The criteria Search will return either an index indicating where the word was found or a value that cannot be displayed. The isnumber() criteria transforms this into a true or false statement that can be evaluated. So if search finds the word “Anonymous” in cell B2 it will return an index, that is a number resulting in a “t” being displayed.

The more text modification formulas you know the more powerful the filtering can become.

Excel Filtering Your Data

There are several ways to filter your data in excel. One of the more convenient and quicker methods is to use a filter.

For example if you are only after one particular customer’s sales within a spreadsheet you could place a filter on the Customer Name or whichever column allows you to uniquely identify your customer (If you have customers with the same name you may need to filter on a customer ID instead).

Filtering your data is a relatively straightforward process

  • Click in any of the populated cells within the first row to make the cell active.
  • The filter option can be location under the Data Tab.Filtering
  • Click filter, you will be given a drop down box allowing you to select from a unique set of values in each column. In the below screenshot, I can now choose to select unique customers.
    Filtering2

Please note, filtering the data will not remove any of it, it will only change how it is displayed. So if you are planning on using a formula that is usually used to calculate totals such as Sum(), you may pick up data that is not displayed. In that particular circumstance you should use a formula that will only work on the filtered data sum as Subtotal(9,C2:C11).

Filtering4

Excel Formulas A Beginners Guide

One of the most useful benefits Excel provides is a straight-forward way of calculating data. In the below example I will be demonstrating how to create a margin formula which will be recalculated on each row.

By default formula calculations are set to automatic, however sometimes they can be switched off. To double-check they are turned on, in the tool strip click “Formulas”, “Calculation”, “Calculation Options” and click “Automatic”.

formulas

Click on the first blank cell of the column you would like to populate with your formula.

Press = to change the cell to formula mode. When the cell is in formula mode if you click on another cell it will populate your active cell with that address.

.formulas2

You can also type your formula if it is more convenient =C2-D2

After you are satisfied with your formula hit enter and your first cell will be populated with the resultant value.

To copy this formula into the rest of the column, click on the bottom right-hand corner of your new formula cell and drag down to populate the column. Alternatively if you know that you want to populate the whole column, double-click on the right-hand corner and excel will populate all of the cells available until Column D has a blank space (which in the above example is D8).

If you now click on one of the populated cells you can see that the formula has changed with the cell index. D4 has the formula “=C4-D4”.

formulas3

After you are happy with how the data is displayed and are confident that it will stay static (e.g. you know that the sales for LargeData will stay at 5645)  it is good practice to remove the formula from the cells. At the moment if the Sales or Cost data is modified the Margin data will be too, which may be beneficial if your data is going to change.

Click the E column to highlight it and press Ctrl and C to copy the data to the clipboard.

formulas4

Right-click in the first cell of the column and click “paste special.

formulas45jpgformulas7

Change from the default option “All” to “Values”. Now only the numerical value will remain.